A METHOD FOR GENERATING ENGINEERED CELLS FOR LOCUS SPECIFIC 
GENE REGULATION AND ANALYSIS 

CROSS-REFERENCE TO RELATED APPLICATIONS 

[0001] This Application claims the benefit of U.S. Provisional Application No. 
60/349,565, filed January 18, 2002, the disclosure of which is incorporated herein by reference 
in its entirety. 

TECHNICAL FIELD OF THE INVENTION 

[0002] The invention is related to the area of homologous recombination in eukaryotic 
cells for studying gene function, gene expression, and generating over-producer clones for 
high protein production. In particular it is related to the field of therapeutic target discovery, 
pharmacologic compound screening and protein manufacturing. 

BACKGROUND OF THE INVENTION 

[0003] The use of specific gene targeting in eukaryotic cell-based model systems provides 
an effective and selective strategy for studying the function of a particular gene in response to 
biological or chemical molecules as well as for model systems to produce biochemicals for 
therapeutic use. Of particular interest is the use of homologous recombination to: (1) inactivate 
gene function to study downstream functions; (2) introduce reporter gene molecules into targeted 
loci to facilitate the screening of gene expression in response to biomolecules and/or 
pharmaceutical compounds; (3) generate stable, steady-state expression of target genes via the 
introduction of constitutively active heterologous promoter elements or through chromosomal 
site-specific gene amplification. 

[0004] Standard methods for introducing targeting genes to a locus of interest are known 
by those skilled in the art. Gene targeting in prokaryotes and lower organisms has been well 
established, and methods for in vivo gene targeting in animal models have also been described 
(de Wind N. et al. (1995) "Inactivation of the mouse Msh2 gene results in mismatch repair 
deficiency, methylation tolerance, hyperrecombination, and predisposition to cancer" Cell 
82:321-300). 

[0005] The generation of knockouts in somatic cells, however, is more problematic due to 
low efficiency of transfection and endogenous biochemical activities that monitor for DNA 
strand exchange. Work done by Waldman et al. (Waldman, T., Kinzler, K.W., and 
Vogelstein, B. (1995) Cancer Res. 55:5187-5190) demonstrated the ability to generate somatic 
cell knockouts in a human cell line called HCT116 at a relatively high rate. In the described 
studies, the authors used a targeting vector containing the neomycin (neo) resistance gene to 
knock out a locus of interest. Using this cell line, the authors reported that 37% of the neo 
resistant clones tested contained a targeting vector within the homologous locus in the 
genome of the host. 

[0006] Similar studies using other cell lines by these authors have been less successful. 
While the reason(s) for the lack or significant reduction in the frequency of recombination in 
somatic cell lines are not clear, some factors, such as the degree of transfection as well as the 
differences that may occur within the intracellular milieu of the host, may play critical roles 
with regard to recombination efficiency. In the studies performed by Waldman et al., the cell 
line that the authors used was inherently defective for mismatch repair (MMR), a process 
involved in monitoring homologous recombination (de Wind N. et al. (1995) Cell 
82:321-300). One proposed explanation for the high degree of recombination in this line was the 
lack of MMR, which has been implicated as a critical biochemical pathway for monitoring 
recombination (Reile, TE et al. WO 97/05268; Rayssigguier, C., et al. (1989) Nature 
342:396-401; Selva, E., et al. (1995) Genetics 139:1175-1188; U.S. Patent No. 5,965,415 to 
Radman). Indeed, studies using mammalian and prokaryotic cells defective for MMR have 
previously demonstrated increased chromosomal recombination with DNA fragments 
having up to 30% difference in sequence identity. 

[0007] Nevertheless, homologous recombination in mammalian somatic cell lines has 
been and remains problematic due to the low efficiency of recombination. Although it is 
believed by many skilled in the art that the low rate of homologous recombination may be 
overcome by the blockade of MMR (Reile, TE et al. WO 97/05268; Rayssigguier, C., et al. 
(1989) Nature 342:396-401; Selva, E., et al. (1995) Genetics 139:1175-1188; U.S. Patent No. 
5,965,415 to Radman; Beth Elliott and Maria Jasin, "Repair of Double-Strand Breaks by 
Homologous Recombination in Mismatch Repair-Defective Mammalian Cells" (2001) Mol. 
Cell. Biol. 21:2671-2682), these methods teach the use of MMR defective unicellular 
organisms to increase homologous recombination. A significant bottleneck to this approach is 
the need to clone large segments of homologous DNA from the target locus. Moreover, while 
it has been reported that short oligonucleotides are capable of homologously recombining at 
site-specific regions of the genome (Igoucheva O, Alexeev V, Yoon K., (2001) "Targeted gene 
correction by small single-stranded oligonucleotides in mammalian cells" Gene Ther. 8:391-399), 
the ability to integrate larger fragments with short terminal regions of homology remains 
elusive. In fact, recent studies by Inbar et al. (Inbar O, Liefshitz B, Bitan G, Kupiec M., 
(2000) "The Relationship between Homology Length and Crossing Over during the Repair of 
a Broken Chromosome" J. Biol. Chem. 275:30833-30838) demonstrated that fragments that 
contained only 123 bps of homologous sequence were not sufficient to induce homologous 
exchange of large DNA fragments in yeast. It has not been heretofore demonstrated that 
larger DNA fragments, such as those containing regulated or constitutively active promoter 
elements, gene inserts or reporter genes, could be integrated into the exon of a locus in somatic 
mammalian cell lines with short, homologous terminal ends, such as fragments of only 20-120 
nucleotides. 

SUMMARY OF THE INVENTION 

[0008] The ability to generate site-directed "knock-ins" in eukaryotic cells, in particular 
mammalian cells, used for drug screening or development of custom cell lines for constitutive 
gene expression is of great value for pharmaceutical drug product development as well as for 
compound screening. Compounds can be of a low molecular weight, a complex 
macromolecule or protein. The compound can be targeted to a gene of interest whose 
expression is altered either positively or negatively by directly or indirectly affecting the 
activity of promoter and/or enhancer elements that are involved in regulating the expression of 
a specific gene locus. One method taught in this application is the "knock-in" of constitutively 
active promoter elements (such as but not limited to viral promoters, i.e. SV40 early or late 
promoters, CMV, LTR, etc. or promoters from constitutively expressed housekeeping genes 
such as the elongation factor or actin) into a desired locus. The ability to direct constitutive 
gene expression from a host organism's genome may lead to the establishment of cell lines 
such as but not limited to those that overproduce therapeutic targets for drug binding studies, 
gene function studies as well as lines that overproduce therapeutic proteins for product 
manufacturing applications. 

[0009] It is an object of the present invention to teach the process of rapidly generating 
gene-targeting fragments for eukaryotic cells, in particular somatic mammalian cells, that can 
result in the site-specific chromosomal targeting of regulatory sequences that can alter 
endogenous gene expression of a given locus for function studies and gene product 
production. In addition, it is another object of the invention to teach the process of rapidly 
generating gene targeting fragments for eukaryotic cells that are capable of targeting a single 
exon of a chromosomal locus with a marker that can be used for monitoring gene expression 
to elucidate gene function with respect to disease and to monitor gene expression of a given 
locus in response to biological and pharmacological agents. It is another object of the 
invention to teach the process of generating locus-specific targeting fragments containing the 
dihydrofolate reductase (DHFR) gene for rapid, site-specific chromosomal integration and 
site-specific gene amplification as a tool for enhancing protein production for development 
and/or manufacturing applications. 

[0010] The invention provides methods for introducing a locus specific targeting fragment 
into the genome of a cell through homologous recombination comprising: inhibiting 
endogenous mismatch repair of the cell; introducing a locus specific targeting fragment into 
the cell; wherein the locus specific targeting fragment is a polynucleotide comprising at least 
one promoter, a selectable marker and 5' and 3' flanking regions of about 20 to about 120 
nucleotides; wherein the 5' and 3' flanking regions are homologous to a selected portion of the 
genome of the cell; and wherein the locus specific targeting fragment integrates into the 
genome of the cell by homologous recombination. 

[0011] The invention also provides methods for genetically altering a cell to overproduce a 
selected polypeptide comprising: inhibiting endogenous mismatch repair of the cell; 
introducing a locus specific targeting fragment into the cell; wherein the locus specific 
targeting fragment is a polynucleotide comprising at least one promoter sequence, a selectable 
marker and 5' and 3' flanking regions of about 20 to about 120 nucleotides, wherein the 5' and 
3' flanking regions are homologous to a selected portion of the genome of the cell, and 
wherein the locus specific targeting fragment integrates into the genome of the cell by 
homologous recombination; and selecting the cell that overproduces the selected polypeptide. 

[0012] The invention also provides methods for tagging an exon of a cell for screening 
gene expression in response to biochemical or pharmaceutical compounds comprising: 
inhibiting endogenous mismatch repair of the cell; and introducing a locus specific targeting 
fragment into the cell; wherein the locus specific targeting fragment is a polynucleotide 
comprising a reporter element, a selectable marker and 5' and 3' flanking regions of about 20 
to about 120 nucleotides, wherein the 5' and 3' flanking regions are homologous to a selected 
portion of the genome of the cell; wherein the locus specific targeting fragment integrates 
within a targeted gene's exon by homologous recombination; and wherein the cells containing 
genes with tagged exons are used for screening gene expression in response to biochemical or 
pharmaceutical compounds. 

[0013] The invention also provides methods for tagging a specific chromosomal site for 
locus-specific gene amplification comprising: inhibiting endogenous mismatch repair of the 
cell; and introducing a locus specific targeting fragment into the cell; wherein the locus 
specific targeting fragment is a polynucleotide comprising, operatively linked: a dihydrofolate 
reductase gene, a promoter, and 5' and 3' flanking regions of about 20 to about 120 
nucleotides, wherein the 5' and 3' flanking regions are homologous to a selected portion of the 
genome of the cell; wherein the locus specific targeting fragment integrates into the genome of 
the cell by homologous recombination; and wherein the specific chromosomal site is tagged 
for locus specific gene amplification. 

[0014] In some embodiments of the method of the invention, the method further comprises 
restoring mismatch repair activity of the cell. 

[0015] In some embodiments of the methods of the invention, the promoter may be a 
CMV promoter, an SV40 promoter, elongation factor, LTR sequence, a pDSTD promoter 
sequence, a tetracycline promoter sequence, or a MMTV promoter sequence. 
[0016] In some embodiments of the methods of the invention, the selectable marker may 
be a hygromycin resistance gene, a neomycin resistance gene or a zeocin resistance gene. 
[0017] In some embodiments of the methods of the invention, the 5' and 3' flanking 
regions are about 30 to about 100 nucleotides in length. In other embodiments of the methods 
of the invention, the 5' and 3' flanking regions are about 40 to about 90 nucleotides in length. 
In other embodiments of the methods of the invention, the 5' and 3' flanking regions are about 
50 to about 80 nucleotides in length. In other embodiments of the methods of the invention, 
the 5' and 3' flanking regions are about 50 to about 70 nucleotides in length. 
[0018] In some embodiments of the methods of the invention, the cell may be a vertebrate 
cell, an invertebrate cell, a mammalian cell, a reptilian cell, a fungal cell, or a yeast cell. 
[0019] In some embodiments of the methods of the invention, the 5' and 3' flanking 
regions are homologous to a 5' flanking region of a selected chromosomal locus of the cell. 
[0020] In some embodiments of the methods of the invention, the mismatch repair is 
inhibited by introducing into the cell a dominant negative allele of a mismatch repair gene. In 
other embodiments, mismatch repair is inhibited using a chemical inhibitor of mismatch 
repair. In embodiments using a dominant negative allele of a mismatch repair gene, the allele 
may be a dominant negative form of a PMS2 (SEQ ID NO:2 and SEQ ID NO:4), PMS1 (SEQ 
ID NO:6), [illegible] (SEQ ID NO:8), [illegible] (SEQ ID NO:41), [illegible] (SEQ ID NO:10), PMSR2 
(SEQ ID NO:43), or a PMSR3 (also known as PMSL[illegible]) (SEQ ID NO:45). In some 
embodiments, the dominant negative form of the PMS2 gene is a PMS2-134 gene (SEQ ID 
NO:12), a PMSR2 gene (SEQ ID NO:43), or a PMSR3 gene (SEQ ID NO:45). 
[0021] Some embodiments of the method may comprise a polynucleotide that also 
comprises a reporter element, including, but not limited to, a form of luciferase or a green 
fluorescent protein. In some embodiments, the reporter element is fused in frame to the 
selectable marker. 

[0022] In some embodiments, the locus specific targeting fragment further comprises a 
selectable marker and a second promoter operatively linked to the selectable marker. 
[0023] The invention also provides locus specific targeting fragments comprising: a 
dihydrofolate reductase gene operatively linked to a promoter, and 5' and 3' flanking regions 
of about 20 to about 120 nucleotides wherein the 5' and 3' flanking sequences are homologous 
to a selected portion of a genome of a cell. 

[0024] The invention also provides locus specific targeting fragments comprising: a 
reporter element, a selectable marker operatively linked to a promoter, and 5' and 3' flanking 
regions of about 20 to about 120 nucleotides. 

[0025] The invention also provides locus specific targeting fragments comprising: at least 
one promoter sequence, a selectable marker and 5' and 3' flanking regions of about 20 to 
about 120 nucleotides. 

[0026] In some embodiments of the compositions of the invention, the locus specific 
targeting fragment further comprises a selectable marker operatively linked to a second 
promoter sequence. The compositions may further comprise an IRES between two 
protein encoding sequences such as between a dihydrofolate reductase gene and a selectable 
marker, for example. 
[0027] In some embodiments the 5' and 3' flanking regions of the locus specific targeting 
sequence are about 30 to about 100 nucleotides in length. In other embodiments the 5' and 3' 
flanking regions of the locus specific targeting sequence are about 40 to about 90 nucleotides 
in length. In other embodiments the 5' and 3' flanking regions of the locus specific targeting 
sequence are about 50 to about 80 nucleotides in length. In other embodiments the 5' and 3' 
flanking regions of the locus specific targeting sequence are about 50 to about 70 nucleotides 
in length. 

[0028] The invention also provides methods for producing a locus specific targeting 
fragment comprising amplifying a nucleic acid construct comprising a promoter and a 
selectable marker with a 5' and 3' primer in a polymerase chain reaction, wherein the 5' 
primer comprises about 20 to about 120 nucleotides that are homologous to a portion of the 
genome of a cell positioned 5' of a target locus, and wherein the 3' primer comprises about 20 
to about 120 nucleotides that are homologous to a portion of the genome of a cell positioned 3' 
of the target locus. 

[0029] In some embodiments of the method of the invention, the nucleic acid construct 
further comprises a second protein encoding sequence operatively linked to a second 
promoter. In some embodiments, the second protein encoding sequence is a dihydrofolate 
reductase sequence. 

[0030] In some embodiments, the method further comprises the step of selecting the cells 
based on resistance to methotrexate. In some embodiments, the locus specific targeting 
fragment further comprises an operatively positioned locus control region. 

[0031] The invention also provides methods for introducing a locus specific targeting 
fragment into the genome of a cell through homologous recombination comprising: 
introducing a locus specific targeting fragment into a mismatch repair-deficient cell; wherein 
the locus specific targeting fragment is a polynucleotide comprising a nucleic acid sequence to 
be incorporated into the genome of the mismatch repair deficient cell; wherein the 
polynucleotide comprises portions of about 20 to about 120 nucleotides, each flanking the 5' 
and 3' portion of the nucleic acid sequence to be incorporated into the genome; wherein the 5' 
and 3' flanking regions are homologous to a selected portion of the genome of the cell; and 
wherein the locus specific targeting fragment integrates into the genome of the mismatch 
repair deficient cell by homologous recombination. 
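
The integration event described above can be illustrated in silico. The following is a minimal sketch, not part of the disclosed method: it treats the genome and the targeting fragment as plain strings, assumes exact (100%) matches for both flanking regions, and treats the "about 20 to about 120" bounds as hard limits. The function name `integrate_fragment` and its parameters are hypothetical.

```python
def integrate_fragment(genome: str, five_arm: str, payload: str, three_arm: str) -> str:
    """Toy model of integration by homologous recombination.

    The sequence between the two homologous flanking regions in the genome
    is replaced by the payload (e.g., promoter plus selectable marker).
    Exact string matching stands in for homology; this is an assumption.
    """
    for name, arm in (("5'", five_arm), ("3'", three_arm)):
        if not 20 <= len(arm) <= 120:
            raise ValueError(f"{name} flanking region must be 20 to 120 nt, got {len(arm)}")

    start = genome.find(five_arm)
    if start == -1:
        raise ValueError("5' flanking region not found in genome")
    if genome.find(five_arm, start + 1) != -1:
        raise ValueError("5' flanking region matches more than one site")

    arm_end = start + len(five_arm)
    three_start = genome.find(three_arm, arm_end)
    if three_start == -1:
        raise ValueError("3' flanking region not found downstream of the 5' region")

    # Keep everything up to the end of the 5' arm, insert the payload,
    # then resume the genome at the start of the 3' arm.
    return genome[:arm_end] + payload + genome[three_start:]
```

In this sketch a flanking region that occurs more than once is rejected, since an ambiguous match would not define a single locus.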

[0032] The invention described herein is directed to the use of a process for the rapid 
generation of locus specific targeting fragments (LSTFs) that are capable of integrating within 
a given locus, to regulate the expression of a specific gene locus in a host cell for product 
manufacturing, studying gene function, and/or profiling gene expression under 
homeostatic, pathogenic, or environmentally altered conditions. Promoter targeted eukaryotic 
cell lines are generated by using 50-150 nucleotide (nt) primers whereby the 3' termini of each 
primer (last 30 nts) are specific for the 5' and 3' end of a plasmid cassette containing an 
expression element (i.e., constitutive promoter) juxtaposed to a constitutively expressed 
selectable marker gene (neomycin-, hygromycin-resistant, etc., gene). The 5' sequence 
(20 to 120 nts) of each primer preferably contains 100% homology to the chromosomal target 
area of interest. In the case of generating tagged exons within a targeted locus, a similar 
method is employed as above, except that the cassette contains a reporter element such as, but 
not limited to, firefly luciferase (shown by nucleic acid sequence, SEQ ID NO:35, and amino 
acid sequence, SEQ ID NO:34), green fluorescent protein (shown by nucleic acid sequence, 
SEQ ID NO:37, and amino acid sequence, SEQ ID NO:36), bacterial luciferases, Renilla 
luciferase (shown by nucleic acid sequence, SEQ ID NO:39, and amino acid sequence, SEQ 
ID NO:38), a bifunctional ruc-gfp chimera (comprising a cDNA for Renilla luciferase (ruc) 
in-frame with a cDNA encoding the "humanized" GFP (gfp) from Aequorea (Wang et al. (2002) 
Mol. Genet. Genomics 268(2):160-168)), and the like, fused in-frame to a selectable marker 
for selection. Finally, LSTFs can be used to deliver a DNA fragment encoding a 
constitutively expressed dihydrofolate reductase gene (DHFR) juxtaposed to a constitutively 
expressed selection marker into a specific chromosomal site. Upon integration of the 
DHFR-LSTF, cells can be chemically selected for locus amplification via drug resistance using 
methods known by those skilled in the art, which in turn will result in amplification of a gene 
locus and potentially over expression of its encoded gene product. 
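
The primer layout described above (a 5' portion of 20 to 120 nts homologous to the chromosomal target, followed by a 3' portion of 30 nts specific for the cassette end) can be sketched as follows. This is an illustrative sketch only: the function names, the default homology length of 60 nts, and the convention that `upstream` and `downstream` are the genomic sequences immediately 5' and 3' of the intended insertion site are all assumptions, not taken from the disclosure.

```python
from __future__ import annotations

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_lstf_primers(upstream: str, downstream: str, cassette: str,
                        homology_len: int = 60, anneal_len: int = 30) -> tuple[str, str]:
    """Build forward and reverse primers for PCR synthesis of an LSTF.

    Each primer is homology_len + anneal_len nt long, so 20 to 120 nt of
    homology with a 30 nt annealing region gives the 50 to 150 nt primers
    mentioned in the text.
    """
    if not 20 <= homology_len <= 120:
        raise ValueError("homology_len must be between 20 and 120 nt")
    if len(upstream) < homology_len or len(downstream) < homology_len:
        raise ValueError("genomic flank shorter than requested homology length")
    if len(cassette) < anneal_len:
        raise ValueError("cassette shorter than annealing region")

    five_arm = upstream[-homology_len:]      # genomic sequence just 5' of the site
    three_arm = downstream[:homology_len]    # genomic sequence just 3' of the site

    forward = five_arm + cassette[:anneal_len]
    # The reverse primer anneals to the top strand, so it is the reverse
    # complement of (cassette 3' end + genomic 3' arm).
    reverse = reverse_complement(cassette[-anneal_len:] + three_arm)
    return forward, reverse


def expected_product(forward: str, reverse: str, cassette: str,
                     anneal_len: int = 30) -> str:
    """Top-strand sequence of the amplified LSTF: 5' arm + cassette + 3' arm."""
    five_arm = forward[:-anneal_len]
    three_arm = reverse_complement(reverse)[anneal_len:]
    return five_arm + cassette + three_arm
```

A real design would also need to check primer melting temperature, secondary structure, and uniqueness of the homology arms in the genome, none of which is modeled here.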

[0033] The homologous recombination of small overlapping DNA regions is difficult to 
achieve; however, it is taught by this application that inhibiting mismatch repair 
(MMR) in eukaryotic somatic cells increases the efficiency of homologous recombination, which 
allows for the rapid generation of recombinants using homologous regions as short as 20, 25, 
30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, or 120 nucleotides in 
length. In some embodiments, the homologous regions are as short as about 25 to about 115 
nucleotides in length. In other embodiments, the homologous regions are as short as about 30 
to about 110 nucleotides in length. In other embodiments, the homologous regions are as short 
as about 35 to about 105 nucleotides in length. In other embodiments, the homologous regions 
are as short as about 40 to about 100 nucleotides in length. In other embodiments, the 
homologous regions are as short as about 45 to about 95 nucleotides in length. In other 
embodiments, the homologous regions are as short as about 50 to about 90 nucleotides in 
length. In other embodiments, the homologous regions are about 50 to about 85 nucleotides in 
length. In other embodiments, the homologous regions are about 50 to about 80 nucleotides in 
length. In other embodiments, the homologous regions are about 50 to about 75 nucleotides in 
length. In other embodiments, the homologous regions are about 50 to about 70 nucleotides in 
length. 
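
The nested length ranges listed above can be captured in a small lookup, which may be convenient when checking a candidate homology arm against the stated embodiments. This is only a sketch: the names `HOMOLOGY_RANGES` and `narrowest_range` are hypothetical, and the word "about" in the text is interpreted here as inclusive hard bounds, which is an assumption.

```python
from __future__ import annotations

# (min_nt, max_nt) pairs taken from the ranges recited in the paragraph above.
HOMOLOGY_RANGES = [
    (20, 120), (25, 115), (30, 110), (35, 105), (40, 100), (45, 95),
    (50, 90), (50, 85), (50, 80), (50, 75), (50, 70),
]


def narrowest_range(arm_len: int) -> tuple[int, int] | None:
    """Return the tightest recited range containing arm_len, or None."""
    matches = [r for r in HOMOLOGY_RANGES if r[0] <= arm_len <= r[1]]
    if not matches:
        return None
    return min(matches, key=lambda r: r[1] - r[0])
```

For example, a 60 nt arm falls in every recited range, and the narrowest one containing it is 50 to 70 nts.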

[0034] The inhibition of MMR in such hosts can be achieved by using dominant negative 
mutant MMR genes as described (Nicolaides, N.C. et al. (1998) "A naturally occurring 
hPMS2 mutation can confer a dominant negative mutator phenotype" Mol. Cell. Biol. 
18:1635-1641; U.S. Patent No. 6,146,894 to Nicolaides et al.) or through the use of chemicals 
that can inhibit MMR of a host organism. Once the targeting vector is introduced, MMR is 
restored by removal of the dominant negative allele or removal of the MMR inhibitor and 
hosts are selected for integrated fragments by selection of the appropriate marker gene. 
[0035] The use of somatic eukaryotic cells containing knocked-in expression control 
elements or exon-tags, or DHFR amplification units as taught by this application, will 
facilitate studies on elucidating unknown gene function by the ability to over express genomic 
loci at will under a variety of experimental growth conditions in the presence or absence of 
exogenous biological or pharmacological factors. Moreover, the use of such an approach to 
specifically tag a gene's exon will facilitate the profiling of gene expression under certain 
growth conditions in wild type and pathogenic cells grown in the presence or absence of 
biological or pharmaceutical factors. Finally, the ability to specifically amplify chromosomal 
regions can facilitate enhanced protein production in a given host organism for discovery, 
development, and/or manufacturing of a given gene product. 

[0036] The invention described herein is directed to the creation of genetically modified 
eukaryotic cells, in particular, somatic mammalian cells containing targeted loci with regulated 
or constitutively active expression elements for use in uncovering gene function or 
polypeptide production as well as the use of targeting vectors that can tag an exon of a locus 
which can subsequently be monitored in response to biological or pharmaceutical molecules. 
The ability to generate such cells is facilitated by the use of targeting cassettes containing 
elements that are rapidly modified to target a given locus via PCR-mediated synthesis using 
locus specific primers containing 20-120 nts, specifically 50-70 nts, of homologous sequence 
to the chromosomal target site in combination with the use of agents that can block the 
endogenous MMR of the host during DNA integration to increase recombination efficiency of 
short homologous sequences (Nicholas Nicolaides, personal observation). 

[0037] The present invention describes the facilitated synthesis of gene targeting 
fragments for controlling gene expression from the chromosomal site within eukaryotic cells 
as well as the use of exon-tagging fragments to study gene expression in the presence of 
biological or pharmaceutical agents. The advantages of the present invention are further 
described in the examples and figures described herein. 

[0038] The present invention provides methods for generating somatic eukaryotic cells 
with altered gene expression profiles via homologous recombination in vivo, whereby gene 
expression is altered by the integration of DNA sequences containing constitutive promoter 
elements and a selectable marker. One method for generating such a cell line is through the 
use of DNA fragments containing 20-120 nts of homologous terminal sequences that are 
specific for a gene locus of interest in cells devoid of MMR. 

[0039] The invention also provides methods for generating somatic eukaryotic cells 
containing genes with a tagged exon, whereby the cell is generated via the integration of DNA 
sequences containing reporter elements fused to a selectable marker. One method for 
generating such a cell line is through the use of DNA fragments containing 20-120 nts of 
homologous terminal sequence to a specific gene locus of interest in cells devoid of MMR. 

[0040] The invention also provides methods for generating genetically engineered somatic 
cell lines that over produce polypeptides through the use of promoter targeting fragments to 
chromosomal loci. 

[0041] The invention also provides methods for generating genetically engineered somatic 
cell lines that have a chromosomal site-specific integration of a constitutively expressed 
DHFR gene through the use of locus targeting fragments to chromosomal loci for selection of 
amplified loci through chemical-induced gene amplification using methods known by those 
skilled in the art. 

[0042] In some embodiments, the invention provides methods for generating genetically 
altered cell lines that overproduce polypeptides for function studies. In other embodiments, 
the invention provides methods for generating genetically altered cell lines that overproduce 
polypeptides for production purposes. In other embodiments, the invention provides methods 
for generating genetically altered cell lines with genes whose exons are tagged for screening 
purposes. 

[0043] In some embodiments, the invention provides methods of enhancing the frequency 
of homologous recombination of a DNA fragment within a specific chromosomal locus in 
eukaryotic cells by blocking the MMR activity of the somatic cell host. 

[0044] In some embodiments, the invention provides methods of creating targeted 
eukaryotic cell lines with chromosomal loci containing a DHFR expression vector for 
locus-specific gene amplification. 

[0045] These and other objects of the invention are provided by one or more of the 
embodiments described below. 

[0046] In one embodiment of the invention, a method for making a somatic eukaryotic cell 
line MMR defective, followed by the introduction of a locus specific targeting fragment that 
results in the constitutive expression of a chromosomal locus is provided. A polynucleotide 
encoding a dominant negative allele of a MMR gene is introduced into a target cell. The cell 
becomes hypermutable as a result of the introduction of the gene. A targeting fragment is 
generated by PCR using primers containing sequences homologous to the chromosomal locus 
of interest. The fragment is introduced into the host by transfection. Cell pools are then 
selected for clones with integrated fragments. Selected clones are further analyzed by any 
number of means to assess expression and/or genome integration of a specific site. Upon 
confirmation of site-desired integration, MMR is restored in clones and the cells are useful for 
functional studies or for generating high levels of protein for product development and/or 
manufacturing applications. 

[0047] In another embodiment of the invention, a cell line with a targeted exon is 
provided. A somatic eukaryotic cell line is rendered MMR defective by introduction of a 
dominant negative MMR gene allele, followed by the introduction of a targeting fragment 
containing a reporter gene fused to a selectable marker that results in the tagging of an 
endogenous gene's exon. A polynucleotide encoding a dominant negative allele of 
a MMR gene is introduced into a target cell. The cell becomes hypermutable as a result of the 
introduction of the gene. A targeting fragment is generated by PCR using primers containing 
sequences homologous to the chromosomal locus of interest. The fragment is introduced into 
the host by transfection. Cell pools are then selected for clones with integrated fragments. 
Selected clones are further analyzed by any number of means to assess expression and/or 
genome integration of a specific site. Upon confirmation of site-desired integration, MMR is 
restored in clones and the cells are useful for functional studies to profile endogenous gene 
expression in the presence or absence of biological or pharmacological factors. 

WO 03/062435    PCT/US03/01361 

[0048] In yet another embodiment of the invention, a cell line with a targeted locus is 
provided. A somatic eukaryotic cell line is rendered MMR defective by introduction of a 
dominant negative MMR gene allele, followed by the introduction of a targeting fragment 
containing a DHFR gene and a selectable marker that results in the specific tagging of a 
chromosomal site. A polynucleotide encoding a dominant negative allele of a 
MMR gene is introduced into a target cell. The cell becomes hypermutable as a result of the 
introduction of the gene. A targeting fragment is generated by PCR using primers containing 
sequences homologous to the chromosomal locus of interest. The fragment is introduced into 
the host by transfection. Cell pools are then selected for clones with integrated fragments. 
Selected clones are further analyzed by any number of means to assess expression and/or 
genome integration of a specific site. Upon confirmation of site-desired integration, cells are 
selected for methotrexate (MTX) resistance. MTX-resistant cells are then analyzed for 
chromosomal site amplification using any means useful to those skilled in the art such as but 
not limited to genomic analysis by Southern blot, RNA expression analysis or protein 
expression analysis. Upon successful amplification, MMR is restored in clones and the cells 
are useful for functional studies to profile endogenous gene expression in the presence or 
absence of biological or pharmacological factors as well as for production strains. 
[0049] These and other embodiments of the invention provide the art with methods that 
can rapidly generate gene targeted eukaryotic cells whereby the locus of interest can have 
altered expression profiles to study gene function and/or enhanced production levels for 
manufacturing. Moreover, the invention provides the art with methods to tag an exon of a 
gene that is useful for monitoring gene expression within a given host. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0050] Figure 1 shows a schematic diagram of promoter locus-specific targeting 
fragments (LSTF) and the genomic organization of a target gene. Primer Set A indicates the
primer position of the oligonucleotides used to generate the LSTF for each gene that is useful
for genome analysis. Primer Set B indicates the primer position of oligonucleotides used to
analyze each target gene to confirm locus specific integration. The box below each gene
represents the LSTF, where the shaded areas represent the areas of homology to the target
gene, whereby the homologous region is 50-70 nts in length. The black boxes in the gene
diagram represent exons that are numbered with respect to homology to the target gene
whereby sensitive RT-PCR can be used to assay for fusion spliced cDNAs consisting of CMV
leader sequence located 3' to the CMV promoter elements. The targeting cassette is used for
generating constitutive expression from a eukaryotic host's genome.

[0051] Figure 2 shows expression of β-globin in HEK293 cells transfected with LSTFs.
RT-PCR analysis of RNA extracted from 293PMS134 cells transfected with mock LSTF or
Hyg-CMV β-globin LSTF. Reverse transcriptase PCR was carried out using equal amounts of
total RNA from each cell line and a 5' primer located in the leader sequence downstream of
the CMV promoter (SEQ ID NO:21) and a 3' primer located in the coding region of the
beta-globin gene (SEQ ID NO:25). PCR reactions were electrophoresed on 2% agarose gels,
ethidium bromide stained and visualized using a UV light box. The arrow indicates a product 
of the expected molecular weight. 

[0052] Figure 3A shows the sequence of the fusion gene hygromycin-green fluorescence
binding protein for exon tagging of somatic cells. The sequence in bold encodes for the
hygromycin resistance gene, while the sequence in normal font encodes the green fluorescence 
binding protein. 

[0053] Figure 3B shows the sequence of the fusion gene hygromycin-luciferase for exon 
tagging of somatic cells. The sequence in bold encodes for the hygromycin resistance gene, 
while the sequence in normal font encodes the luciferase protein. 

[0054] Figure 4 shows a schematic diagram of exon locus-specific targeting fragments 
(LSTF) and the genomic organization of a target gene. The LSTF contains a selectable marker 
gene (i.e., hygromycin, neomycin, zeocin, etc.) that is in frame with a reporter gene, (i.e., 
luciferase, Green Fluorescent Protein, etc.). Primer Set A indicates the primer position of 
oligonucleotides used to analyze each target gene to confirm locus specific integration where 
the 5' primer is located in the exon preceding the targeted exon and the 3' primer is located 
proximal to the site of integration. The box below each gene represents the LSTF, where the 
shaded areas represent the areas of homology to the target gene, whereby the homologous 
region is 50-70 nts in length. The black boxes in the gene diagrams represent exons whereby
RT-PCR can be used to assay for fusion of spliced cDNAs consisting of the selectable
marker-reporter cDNA within the targeted gene's encoded transcript.

DETAILED DESCRIPTION OF THE INVENTION 

[0055] Various definitions are provided herein. Most words and terms have the meaning
that would be attributed to those words by one skilled in the art. Words or terms specifically
defined herein have the meaning provided in the context of the present invention as a whole
and as are typically understood by those skilled in the art. Any conflict between an
art-understood definition of a word or term and a definition of the word or term as specifically
taught herein shall be resolved in favor of the latter. Headings used herein are for convenience
and are not to be construed as limiting.
[0056] As used herein, "MMR" refers to mismatch repair. 

[0057] As used herein, "inhibitor of mismatch repair" refers to an agent that interferes 
with at least one function of the mismatch repair system of a cell and thereby renders the cell 
more susceptible to mutation. 

[0058] As used herein, "hypermutable" refers to a state in which a cell in vitro or in vivo is
made more susceptible to mutation through a loss or impairment of the mismatch repair 
system. 

[0059] As used herein, "agents," "chemicals," and "inhibitors" when used in connection
with inhibition of MMR refer to chemicals, oligonucleotides, analogs of natural substrates,
and the like that interfere with normal function of MMR.

[0060] The term "gene" is used herein to denote a DNA segment encoding a polypeptide,
and includes genomic DNA (with or without intervening sequences), cDNA, and synthetic
DNA. Genes may include non-coding sequences, including promoter elements.

[0061] As used herein, "operably linked", when referring to DNA segments, indicates that
the segments are arranged so that they function in concert for their intended purposes, e.g.,
transcription initiates in the promoter and proceeds through the coding segment to the
terminator.

[0062] The term "promoter" is used herein for its art-recognized meaning
to denote a portion of a gene containing DNA sequences that provide for the binding of RNA 
polymerase and initiation of transcription. Promoter sequences are commonly, but not always, 
found in the 5' non-coding regions of genes.
[0063] As used herein, the term "promoter elements" is used to denote sequences within
promoters that function in the initiation of transcription and which are often characterized by
consensus nucleotide sequences. Promoter elements include RNA polymerase binding sites;
TATA sequences; CAAT sequences; differentiation-specific elements (DSEs; McGehee et al.
(1993) Mol. Endocrinol. 7:551-560); cyclic AMP response elements (CREs); serum response
elements (SREs; Treisman (1990) Seminars in Cancer Biol. 1:47-58); glucocorticoid response
elements (GREs); and binding sites for other transcription factors, such as CRE/ATF (O'Reilly
et al. (1992) J. Biol. Chem. 267:19938-19943), AP2 (Ye et al. (1994) J. Biol. Chem.
269:25728-25734), SP1, cAMP response element binding protein (CREB; Loeken (1993)
Gene Expr. 3:253-264) and octamer factors. See, in general, Watson et al., eds., Molecular
Biology of the Gene, 4th ed., The Benjamin/Cummings Publishing Company, Inc., Menlo
Park, Calif., 1987; and Lemaigre and Rousseau (1994) Biochem. J. 303:1-14.
[0064] "Transcription regulatory elements" are promoter-associated DNA sequences that 
bind regulatory molecules, resulting in the modulation of the frequency with which 
transcription is initiated. Transcription regulatory elements can be classified as enhancers or 
suppressors of transcription. 

[0065] As used herein, the term "reporter gene" denotes a gene that, when
expressed in a cell, produces a quantifiable phenotypic change in the cell. Preferred reporter
genes include genes encoding enzymes. Particularly preferred enzymes are luciferase,
β-galactosidase, and chloramphenicol acetyltransferase. Assays for these enzymes are known in
the art. See, for example, Seed and Sheen (1988) Gene 67:271-277; Todaka et al. (1994) J.
Biol. Chem. 269:29265-29270; Guarente et al. (1981) Proc. Natl. Acad. Sci. USA 78:2199-2203;
Mellon et al. (1989) Proc. Natl. Acad. Sci. USA 86:4887-4891; and Brasier et al. (1989)
BioTechniques 7:1116-1122, which are incorporated herein by reference in their entirety.
Reporter genes, assay kits, and other materials are available commercially from suppliers such 
as Promega Corp. (Madison, Wis.) and GIBCO BRL (Gaithersburg, Md.). 
[0066] The inventors have discovered a rapid method for
knocking DNA fragments into target loci of interest to regulate gene expression and/or
function, as well as the ability to rapidly tag an exon of a gene to study expression and
to enhance chromosomal site-specific gene amplification. The process entails the use of
targeting cassettes that are generated via PCR using primers containing 20, 25, 30, 35, 40, 45,
50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, or 120 nucleotides of sequence with
homology to a particular chromosomal locus. Each promoter expression cassette contains
DNA elements that can produce constitutive, inducible or suppressed expression, which are
juxtaposed to a constitutively expressed selectable marker (See Fig. 1). Each exon-tag
cassette contains DNA sequences encoding reporter elements that can be monitored using a
number of detection methods, such as but not limited to green fluorescent protein, luciferase,
etc., which are fused in-frame to a selectable marker (See Fig. 4). Each DHFR expression
cassette contains DNA elements that constitutively express DHFR which are juxtaposed to a
constitutively active selectable marker. In all cases, targeting fragments are generated and
transfected into eukaryotic cell hosts.
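
By way of illustration only, the in-frame fusion of a selectable marker to a reporter, as described for the exon-tag cassette, can be checked computationally before a fragment is made. The following Python sketch is not part of the disclosed method. The function name `fuse_in_frame`, the toy sequences, and the choice to drop the terminal stop codon of the marker are assumptions made for this example.

```python
# Hypothetical helper: checks that a marker ORF and a reporter ORF can be
# joined so that the reporter is read in the same frame as the marker.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(marker_orf: str, reporter_orf: str) -> str:
    """Return a marker-reporter fusion ORF, or raise ValueError if the
    two coding sequences cannot be joined in frame."""
    marker = marker_orf.upper()
    reporter = reporter_orf.upper()

    # Each ORF must be a whole number of codons.
    for name, orf in (("marker", marker), ("reporter", reporter)):
        if len(orf) % 3 != 0:
            raise ValueError(f"{name} ORF length is not a multiple of 3")

    # Assumption: the marker's own stop codon is removed so that
    # translation continues into the reporter.
    if marker[-3:] in STOP_CODONS:
        marker = marker[:-3]

    fusion = marker + reporter
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]

    # Any stop codon before the final codon would truncate the fusion.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon in fusion ORF")
    return fusion


if __name__ == "__main__":
    # Toy sequences, not real hygromycin or luciferase coding regions.
    print(fuse_in_frame("ATGAAATAA", "GGGCCCTGA"))
```

A check of this kind only covers the reading frame within the cassette; whether the cassette is also in frame with the targeted exon depends on the homology arms chosen for the locus.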

[0067] Enhanced site-specific homologous recombination of LSTFs is facilitated in each 
target cell by suppressing the endogenous MMR of the host via the expression of dominant
negative MMR gene mutants or through exposure to chemical inhibitors as described
(Nicolaides, N.C. et al. (1998) "A naturally occurring hPMS2 mutation can confer a dominant
negative mutator phenotype" Mol. Cell. Biol. 18:1635-1641; U.S. Patent No. 6,146,894 to
Nicolaides et al.; Lipkin et al. (2000) "MLH3: a DNA mismatch repair gene associated with
mammalian microsatellite instability" Nat. Genet. 24:27-35).

[0068] In one aspect of the invention, the methods taught here are useful for the generation 
of cells that over express or suppress the expression of a gene(s) to elucidate gene function.
Such cells may be used as tools to identify compounds that can alter the activity of a given
gene product and/or induced pathway in comparison to parental lines. The cell host may be
derived from a variety of sources, for example, normal or pathogenic tissues or organisms.
The targeting fragment may be used, for example, to prevent, inhibit or terminate expression
of a particular gene to elucidate its function, if any, in a particular disease-associated pathway.
Moreover, such cell lines may now be used to screen compound libraries to identify molecules
that act as agonists or antagonists for pharmaceutical product development. One such example
is the ability to over express orphan G coupled receptors (GCR) in a cell line and expose the
line to compound libraries to identify ligands or agonists. The ability to over express a GCR
from the genome via enhanced promoter activity or chromosomal specific amplification is
more beneficial than cloning and establishing stable transgenes, which in many instances
produce very low or no expressed product. Finally, the ability to generate cell lines that can
over produce a secreted or endogenous gene product from a host's genome enhances their use
for biological product manufacturing thus bypassing the need for introducing multiple plasmid
copies into host cell lines and establishing stable expression.

[0069] In another aspect of the invention, the methods are useful for the generation of cells
with endogenous genes containing a tagged exon for monitoring gene expression profiles.
Such cells may be used as tools to monitor physiological activity in the presence or absence of
exogenous factors in comparison to control lines. The cell host may be derived from, for
example, normal or pathogenic organisms to study the expression profile of disease associated
genes under normal or stimulated conditions. Pharmacological studies can be performed in
untreated cultures or in cultures treated with biological or chemical factors to screen for 
therapeutic molecules. The cell lines produced by the method of the invention containing 
tagged exons are also useful for monitoring compound toxicity and efficacy of modulating 
gene expression. 

[0070] Reporter elements may be included in the constructs of the invention. Reporter
elements include assayable proteins which can be detected and/or quantified. Examples of
reporter genes include, but are not limited to, luciferases, such as those known in the art, and
may include firefly luciferase (amino acid, SEQ ID NO:34, nucleic acid SEQ ID NO:35),
bacterial luciferases, and Renilla luciferase (amino acid, SEQ ID NO:38, nucleic acid SEQ ID
NO:39), and green fluorescent protein (amino acid, SEQ ID NO:36, nucleic acid SEQ ID
NO:37). Other reporter elements include genes encoding enzymes, which convert a substrate
that is subsequently detected. Examples include, but are not limited to, β-galactosidase and
chloramphenicol acetyl transferase.

[0071] The reporter gene may be visualized in a variety of assays including both in vivo 
and in vitro assays. For example, but not by way of limitation, reporter genes can be 
visualized by positron emission tomography (PET), single photon emission computed
tomography (SPECT), magnetic resonance imaging (MRI), and fluorescence with wild-type
and mutant green fluorescent protein and luciferase (see Ray et al. (2001) "Monitoring gene 
therapy with reporter gene imaging" Semin. Nucl. Med. 31(4):312-320). 
[0072] For example, in living animals it has been shown that Renilla luciferase reporter 
gene could be used and detected to follow gene expression in vivo (Bhaumik and Gambhir 
(2002) Proc. Natl. Acad. Sci. USA 99(1):377-382). In this study, a highly sensitive cooled
charge-coupled device (CCD) camera provided images of photon counting. Such a device is
suitable for use in the present invention, and is available from Xenogen (In Vivo Imaging
System "IVIS"). Protocols used to image the reporter gene are known in
the art (Bhaumik and Gambhir (2002) Proc. Natl. Acad. Sci. USA 99(1):377-382) and are
suitable for use in the present invention as assays to monitor expression of reporter genes.
[0073] In another example, a bifunctional molecule comprising Renilla luciferase and
Green Fluorescent Protein may be used as a reporter gene to monitor the integration and/or
expression of the LSTF construct. In a study describing the bifunctional construct, a ruc-gfp
fusion gene construct was created by fusing cDNAs for Renilla luciferase (ruc) and
"humanized" GFP (gfp) from Aequorea in frame, and the construct was subsequently
expressed in mammalian cells. The transformed cells exhibited both Renilla luciferase activity
in the presence of the substrate, coelenterazine, and GFP fluorescence upon excitation with
UV light. In animal experiments, the light emission from the fusion construct was detected
externally in the organs and tissues of live animals (Wang et al. (2002) Mol. Genet. Genomics
268(2):160-168). Such a bifunctional construct is suitable for use in the present invention as a
reporter gene. 

[0074] In another embodiment of the invention, proteins expressed from LSTFs may be
visualized in vitro or in vivo using labeled antibodies, or fragments thereof (such as Fab or
F(ab')2 fragments) which specifically bind to the protein of interest. Antibodies may be
labeled using any means known in the art that allow visualization or assaying. Such labels
include, but are not limited to, fluorescent conjugates and radioactive conjugates. Fluorescent
conjugates include luciferases, green fluorescent protein and derivatives, rhodamine, and 
fluorescein. Radioactive compounds include those containing ^^'l, "*In, '^^I, ^^mTc, ^^P 
^H, and ^''C. The antibody or fragments thereof can be labeled with such reagents using 
techniques known in the art (see, for example, Wensel and Meares, Radioimmunoimaging and 
Radioimmunotherapy, Elsevier, New York (1983); D. Colcher et al. (1986) "Use of
Monoclonal Antibodies as Radiopharmaceuticals for the Localization of Human Carcinoma
Xenografts in Athymic Mice" Meth. Enzymol. 121:802-816).

[0075] In yet another embodiment, signaling mechanisms that may be affected by proteins
expressed by LSTFs may be monitored or assayed for functionality. In a non-limiting
example, calcium flux may be measured in cells expressing receptors that affect calcium flux
upon stimulation. Examples of protocols that measure calcium mobilization are the FLIPR®
Calcium Assay Kit, and various protocols using the calcium binding, fluorescent dye, Fluo-3
AM. The protocols are known to those of skill in the art and may be used to measure calcium
mobilization in cells expressing various proteins (such as G-protein coupled receptors, for
example) which have been expressed from an LSTF.

[0076] The LSTF of the invention may be constructed to include a variety of genetic
elements, depending on the application of the LSTF. For example, in some embodiments, an
LSTF may include a promoter operatively linked to a selectable marker. In other
embodiments, the LSTF may include a promoter operatively linked to a selectable marker and
a second protein encoding sequence operatively linked to a second promoter. In constructs
with more than one protein encoding sequence, an internal ribosome entry site (IRES) may
also be included. An IRES element is a regulatory element found in some viral sequences and
some cellular RNAs that enhances translation of a second gene product in a bicistronic
eukaryotic expression cassette (Kaufman et al. (1991) Nucl. Acids Res. 19:4485). An IRES
element may be engineered between two of the coding sequences of the LSTFs of the
invention. In other embodiments in which it is not necessary that a protein sequence is
expressed, a promoter is not required. In such embodiments (e.g., embodiments in which
exons are tagged) it is sufficient that a nucleic acid sequence is present on the construct which
may be detectable through molecular analysis. In embodiments in which chromosomal loci
are targeted for amplification, constructs include a promoter operatively linked to a
dihydrofolate reductase encoding sequence, preferably with a second promoter operatively
linked to a selectable marker.

[0077] A selectable marker may be a gene conferring drug-resistance to the cell. Non- 
limiting examples of such drug resistance selectable markers are genes for neomycin 
resistance, hygromycin resistance and zeocin resistance. 

[0078] In some embodiments of the invention, a locus control region (LCR) may be 
incorporated. An LCR is position and orientation dependent and may be used in a tissue 
specific manner. An LCR may be used in the LSTF of the invention in conjunction with a 
promoter in embodiments used for overproduction of protein. In a non-limiting example of
use of an LCR, an LCR specific for lymphocytes may be used to produce high levels of 
antibodies in B cells using LSTFs that integrate through homologous recombination in the 
immunoglobulin locus. LCRs are known by persons skilled in the art. 

[0079] The constructs are amplified in a polymerase chain reaction (PCR) using 5' and 3'
primers that have been designed to include nucleic acid sequence that is homologous to a
selected portion of the genome of a cell that is targeted for homologous recombination. For
the 5' primer, which anneals to the (-) strand of the DNA in the PCR amplification, the
5'-most sequence of the 5' primer (about 20-120 nucleotides (nts)) is homologous to the selected
portion of the genome targeted for homologous recombination. The 3'-most portion of the 5'
primer comprises nucleotides that are homologous to the 5' portion of the construct to be
amplified. For the 3' primer, which anneals to the (+) strand of the DNA in the PCR reaction,
the 5'-most sequence of about 20-120 nucleotides (nts) is homologous to the selected portion
of the genome targeted for homologous recombination. The 3'-most portion of the 3' primer
comprises nucleotides that are homologous to the 3' portion of the construct to be amplified.
The PCR reaction conditions are not particularly limited. PCR reactions and variations for
optimization are well known in the art, and routine optimization of the reactions, including
choice of buffers, polymerases, additives, etc., is in the purview of the skilled artisan.
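
The primer layout described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by the inventors: the function names, the 20 nt default for the construct-annealing portion, and the toy sequences are all hypothetical. All input sequences are assumed to be written 5' to 3' on the (+) strand.

```python
# Hypothetical sketch of LSTF primer assembly: each primer carries a
# genomic homology arm at its 5' end and a construct-annealing portion
# at its 3' end.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_lstf_primers(upstream_homology: str,
                        downstream_homology: str,
                        construct: str,
                        anneal_len: int = 20) -> tuple[str, str]:
    """Build (5' primer, 3' primer), both written 5' to 3'.

    upstream_homology / downstream_homology: genomic sequence flanking
    the intended integration site (20-120 nt each, per the text above).
    construct: the cassette to be amplified.
    anneal_len: assumed length of the portion that anneals to the construct.
    """
    for arm in (upstream_homology, downstream_homology):
        if not 20 <= len(arm) <= 120:
            raise ValueError("homology arms must be 20-120 nt long")
    if len(construct) < 2 * anneal_len:
        raise ValueError("construct is shorter than two annealing regions")

    construct = construct.upper()
    # 5' primer: homology arm followed by the start of the construct.
    five_prime = upstream_homology.upper() + construct[:anneal_len]
    # 3' primer: reverse complement of (end of construct + downstream arm),
    # so the homology arm ends up 5'-most and the construct portion 3'-most.
    three_prime = reverse_complement(
        construct[-anneal_len:] + downstream_homology
    )
    return five_prime, three_prime


if __name__ == "__main__":
    # Toy sequences only; real arms would come from the locus of interest.
    fwd, rev = design_lstf_primers("ACGT" * 5, "TTGCA" * 4,
                                   "ATG" + "C" * 40 + "TAA")
    print(fwd)
    print(rev)
```

Amplifying the construct with such a primer pair would yield a fragment flanked on both sides by locus-specific sequence; melting temperature, secondary structure, and specificity checks are left out of this sketch.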
[0080] According to one aspect of the invention, a polynucleotide encoding for a dominant 
negative form of a MMR protein is introduced into a cell. The gene can be any dominant
negative allele encoding a protein, which is part of a MMR complex. The dominant negative
allele can be naturally occurring, or made in the laboratory. The dominant negative allele may
be, for example, a PMS2 allele and homologs thereof that confer a dominant negative
phenotype. For example, the allele may be a PMS2-134 allele, a PMSR2 allele or a PMSR3
allele. The polynucleotide can be in the form of genomic DNA, cDNA, RNA, or a chemically
synthesized polynucleotide. 

[0081] The polynucleotide can be cloned into an expression vector containing a 
constitutively active promoter segment (such as but not limited to CMV, SV40, Elongation
Factor (EF) or LTR sequences) or to inducible promoter sequences such as the steroid
inducible pIND vector (Invitrogen), tetracycline, or mouse mammary tumor virus (MMTV),
where the expression of the dominant negative MMR gene can be regulated. The
polynucleotide can be introduced into the cell by transfection. As used herein, a "promoter" is
a DNA sequence that encompasses binding sites for trans-acting transcription factors.
Promoters, when positioned 5' of protein encoding sequences, form a basic transcriptional unit.
[0082] According to another aspect of the invention, a targeting fragment containing 20,
25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, or 120 nts of 5'
and 3' homologous sequence is transfected into MMR deficient cell hosts, and the cells are
grown and screened for clones containing chromosomes whereby the targeting fragment has
been integrated. MMR defective cells may be of human, primate, mammalian, rodent, fish,
plant, fungal, or yeast origin, or of the prokaryotic kingdom.

[0083] Transfection is any process whereby a polynucleotide is introduced into a cell. The 
process of transfection can be carried out in a living animal, e.g., using a vector for gene 
therapy, or it can be carried out in vitro, e.g., using a suspension of one or more isolated cells 
in culture. The cell can be any type of eukaryotic cell, including, for example, cells isolated 
from humans or other primates, mammals or other vertebrates, invertebrates, and single celled 
organisms such as protozoa, yeast, or bacteria.

[0084] In general, transfection will be carried out using a suspension of cells, or a single
cell, but other methods can also be applied as long as a sufficient fraction of the treated cells or
tissue incorporates the polynucleotide so as to allow transfected cells to be grown and utilized.
Techniques for transfection are well known. Available techniques for introducing
polynucleotides include, but are not limited to, electroporation (Potter et al. (1988) Proc. Natl.
Acad. Sci. USA 81:7161), transduction, cell fusion, the use of calcium chloride (Sambrook et
al. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, New York,
2000) or calcium phosphate precipitation (Wigler et al. (1980) Proc. Natl. Acad. Sci. USA
77:3567), polyethylene glycol-induced fusion of bacterial protoplasts with mammalian cells
(Schaffner et al. (1980) Proc. Natl. Acad. Sci. USA 77:2163), and packaging of the
polynucleotide together with lipid for fusion with the cells of interest (e.g., using Lipofectin®
Reagent and Lipofectamine® Reagent (Gibco BRL, Gaithersburg, MD)). Once a cell has been
transfected with the targeting fragment containing a selectable marker, the cell can be grown 
and reproduced in culture. If the transfection is stable, such that the selectable marker gene is 
expressed at a consistent level for many cell generations, then a cell line results. Upon 
chromosomal integration, MMR is restored in the host cell, and the genetic stability of the host 
is restored. 

[0085] An isolated cell includes cells obtained from a tissue of humans, animals, plants or 
fungi by mechanically separating out individual cells and transferring them to a suitable cell
culture medium, either with or without pretreatment of the tissue with enzymes, e.g.,
collagenase or trypsin. Such isolated cells are typically cultured in the absence of other types
of cells. Cells selected for the introduction of a targeting fragment may be derived from a 
eukaryotic organism in the form of a primary cell culture or an immortalized cell line, or may 
be derived from suspensions of single-celled organisms.
[0086] Integration of the targeting fragment can be detected by analyzing the
chromosomal locus of interest for alterations in the genotype of the cells or whole organisms,
for example by examining the sequence of genomic DNA, cDNA, RNA, or polypeptides
associated with the gene of interest. Integration can also be detected by screening for the
expression levels of the targeted locus for altered expression profiles, or chimeric transcripts
through biochemical methods or nucleic acid monitoring. Techniques for analyzing nucleic
acids and proteins are well known in the art. Techniques include, but are not limited to,
Southern analysis, northern analysis, PCR, reverse transcriptase-PCR (RT-PCR), restriction
digest mapping, western blot, enzyme-linked immunosorbent assays (ELISA),
radioimmunoassay, immunoprecipitation, and well-known variations of these techniques.
[0087] Examples of mismatch repair proteins that can be used for dominant negative
MMR inhibitors and nucleic acid sequences include the following: mouse PMS2 protein
(SEQ ID NO:1); mouse PMS2 cDNA (SEQ ID NO:2); human PMS2 protein (SEQ ID NO:3);
human PMS2 cDNA (SEQ ID NO:4); human PMS1 protein (SEQ ID NO:5); human PMS1
cDNA (SEQ ID NO:6); human MSH2 protein (SEQ ID NO:7); human MSH2 cDNA (SEQ ID
NO:8); human MLH1 cDNA (SEQ ID NO:9); human MLH1 cDNA (SEQ ID NO:10); human
PMS2-134 protein (SEQ ID NO:11); human PMS2-134 cDNA (SEQ ID NO:12); human
MSH6 protein (SEQ ID NO:40); human MSH6 cDNA (SEQ ID NO:41); human PMSR2
protein (SEQ ID NO:42); human PMSR2 cDNA (SEQ ID NO:43); human PMSR3 protein
(SEQ ID NO:44); and human PMSR3 cDNA (SEQ ID NO:45).

[0088] The LSTFs of the invention may also be used to insert nucleic acid sequences
through homologous recombination in cells that are naturally deficient in mismatch repair. 
Furthermore, cells may be rendered deficient in mismatch repair before, after or 
simultaneously with the introduction of the LSTFs. 

[0089] The invention also employs chemical inhibitors of mismatch repair, such as those
described in WO 02/054856 to Morphotek Inc., "Chemical Inhibitors of Mismatch Repair,"
which is specifically incorporated herein in its entirety. Chemicals that block MMR, and
thereby render cells hypermutable, efficiently introduce mutations in cells and genes of
interest as well as facilitate homologous recombination in treated cells. In addition,
destabilizing the genome of cells by exposure to chemicals that inhibit MMR activity may be
done transiently, allowing cells to become hypermutable, and removing the chemical exposure
after the desired effect (e.g., a mutation in a gene of interest) is achieved. The chemicals that
inhibit MMR activity that are suitable for use in the invention include, but are not limited to,
anthracene derivatives, nonhydrolyzable ATP analogs, ATPase inhibitors, antisense
oligonucleotides that specifically anneal to polynucleotides encoding mismatch repair
proteins, DNA polymerase inhibitors, and exonuclease inhibitors.

[0090] Examples of ATP analogs that are useful in blocking MMR activity include, but
are not limited to, nonhydrolyzable forms of ATP such as AMP-PNP and ATP[gamma]S, which
block the MMR activity (Galio et al. (1999) Nucl. Acids Res. 27:2325-2331; Allen et al.
(1997) EMBO J. 16:4467-4476; Bjornson et al. (2000) Biochem. 39:3176-3183).
[0091] Examples of nuclease inhibitors that are useful in blocking MMR activity include, 
but are not limited to analogs of N-ethylmaleimide, an endonuclease inhibitor (Huang et al. 
(1995) Arch. Biochem. Biophys. 316:485), heterodimeric adenine-chain-acridine compounds, 
exonuclease III inhibitors (Belmont et al. (2000) Bioorg. Med. Chem. Lett. 10:293-295),
as well as antibiotic compounds such as heliquinomycin, which have helicase inhibitory 
activity (Chino et al. (1998) J. Antibiot. (Tokyo) 51:480-486). 

[0092] Examples of DNA polymerase inhibitors that are useful in blocking MMR activity
include, but are not limited to, analogs of actinomycin D (Martin et al. (1990) J. Immunol.
145:1859), aphidicolin (Kuwakado et al. (1993) Biochem. Pharmacol. 46:1909),
1-(2'-deoxy-2'-fluoro-beta-L-arabinofuranosyl)-5-methyluracil (L-FMAU) (Kukhanova et al. (1998)
Biochem. Pharmacol. 55:1181-1187), and 2',3'-dideoxyribonucleoside 5'-triphosphates
(ddNTPs) (Ono et al. (1984) Biomed. Pharmacother. 38:382-389).

[0093] In yet another aspect of the invention, antisense oligonucleotides are administered 
to cells to disrupt at least one function of the mismatch repair process. The antisense
polynucleotides hybridize to MMR polynucleotides. Both full-length antisense
polynucleotides and fragments thereof are suitable for use. "Antisense polynucleotide
fragments" of the invention include, but are not limited to, polynucleotides that specifically
hybridize to an MMR encoding RNA (as determined by sequence comparison of nucleotides
encoding the MMR to nucleotides encoding other known molecules). Identification of
sequences that are substantially unique to MMR-encoding polynucleotides can be ascertained
by analysis of any publicly available sequence database and/or with any commercially
available sequence comparison programs. Antisense molecules may be generated by any
means including, but not limited to, chemical synthesis, expression in an in vitro transcription
reaction, expression in a transformed cell comprising a vector that may be transcribed to
produce antisense molecules, restriction digestion and isolation, the polymerase chain
reaction, and the like.

[0094] Those of skill in the art recognize that the antisense oligonucleotides that inhibit
mismatch repair activity may be predicted using any MMR genes. Specifically, antisense 
nucleic acid molecules comprise a sequence complementary to at least about 10, 15, 25, 50, 
100, 250 or 500 nucleotides or an entire MMR encoding sequence. Preferably, the antisense 
oligonucleotides comprise a sequence complementary to about 15 consecutive nucleotides of 
the coding strand of the MMR encoding sequence. 

[0095] In one embodiment, an antisense nucleic acid molecule is antisense to a "coding 
region" of the coding strand of a nucleotide sequence encoding an MMR protein. The coding 
strand may also include regulatory regions of the MMR sequence. The term "coding region" 
refers to the region of the nucleotide sequence comprising codons which are translated into 
amino acid residues (e.g., the protein coding region of human PMS2 corresponds to the coding 
region). In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence encoding an MMR protein. 
The term "noncoding region" refers to 5' and 3' sequences which flank the coding region that 
are not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions 
(UTR)). 

[0096] Preferably, antisense oligonucleotides are directed to regulatory regions of a 
nucleotide sequence encoding an MMR protein, or mRNA corresponding thereto, including, 
but not limited to, the initiation codon, TATA box, enhancer sequences, and the like. Given 
the coding strand sequences provided herein, antisense nucleic acids of the invention can be 
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense 
nucleic acid molecule can be complementary to the entire coding region of an MMR mRNA, 
but more preferably is an oligonucleotide that is antisense to only a portion of the coding or 
noncoding region of an MMR mRNA. For example, the antisense oligonucleotide can be 
complementary to the region surrounding the translation start site of an MMR mRNA. An 
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 
nucleotides in length. 
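
The Watson-Crick design rule described above can be illustrated with a minimal Python sketch. It assumes a DNA oligonucleotide, a target window of 15 nucleotides placed around the first ATG, and a made-up toy sequence; the function names and the toy sequence are hypothetical and are not part of the disclosure.

```python
# Sketch: derive an antisense oligonucleotide by Watson-Crick pairing.
# The sequence used below is a toy placeholder, NOT a real MMR gene sequence.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_oligo(coding_strand, start, length=15):
    """Antisense oligo against coding_strand[start:start + length]."""
    target = coding_strand[start:start + length].upper()
    if start < 0 or len(target) != length:
        raise ValueError("target window falls outside the sequence")
    return reverse_complement(target)


def start_site_oligo(coding_strand, length=15):
    """Antisense oligo whose target window is placed around the first ATG."""
    atg = coding_strand.upper().find("ATG")
    if atg < 0:
        raise ValueError("no ATG found in the coding strand")
    start = max(0, atg - (length - 3) // 2)
    return antisense_oligo(coding_strand, start, length)


toy_coding_strand = "GGCACCATGGAGCGAGCTGAGAGC"  # hypothetical sequence
print(start_site_oligo(toy_coding_strand))
```

The window length is a parameter, so the same sketch covers any of the oligonucleotide lengths listed above.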

[0097] As used herein the term "anthracene" refers to the compound anthracene. 
However, when referred to in the general sense, such as "anthracenes," "an anthracene" or 
"the anthracene," such terms denote any compound that contains the fused triphenyl core 
structure of anthracene, i.e., 



regardless of extent of substitution. 

[0098] In certain preferred embodiments of the invention, the anthracene has the formula: 



wherein R1-R10 are independently hydrogen, hydroxyl, amino, alkyl, substituted alkyl, 
alkenyl, substituted alkenyl, alkynyl, substituted alkynyl, O-alkyl, S-alkyl, N-alkyl, O-alkenyl, 
S-alkenyl, N-alkenyl, O-alkynyl, S-alkynyl, N-alkynyl, aryl, substituted aryl, aryloxy, 
substituted aryloxy, heteroaryl, substituted heteroaryl, aralkyloxy, arylalkyl, alkylaryl, 
alkylaryloxy, arylsulfonyl, alkylsulfonyl, alkoxycarbonyl, aryloxycarbonyl, guanidino, 
carboxy, an alcohol, an amino acid, sulfonate, alkyl sulfonate, CN, NO2, an aldehyde group, 
an ester, an ether, a crown ether, a ketone, an organosulfur compound, an organometallic 
group, a carboxylic acid, an organosilicon or a carbohydrate that optionally contains one or 
more alkylated hydroxyl groups; 

wherein said heteroalkyl, heteroaryl, and substituted heteroaryl contain at least one 
heteroatom that is oxygen, sulfur, a metal atom, phosphorus, silicon or nitrogen; 

wherein said substituents of said substituted alkyl, substituted alkenyl, substituted 
alkynyl, substituted aryl, and substituted heteroaryl are halogen, CN, NO2, lower alkyl, aryl, 
heteroaryl, aralkyl, aralkyloxy, guanidino, alkoxycarbonyl, alkoxy, hydroxy, carboxy and 
amino; and 

wherein said amino groups are optionally substituted with an acyl group, or 1 to 3 aryl or 
lower alkyl groups; or wherein any two of R1-R10 can together form a polyether; 
or wherein any two of R1-R10 can, together with the intervening carbon atoms of the 
anthracene core, form a crown ether. 

[0099] As used herein, "alkyl" refers to a hydrocarbon containing from 1 to about 20 
carbon atoms. Alkyl groups may be straight, branched, cyclic, or combinations thereof. Alkyl 
groups thus include, by way of illustration only, methyl, ethyl, propyl, isopropyl, butyl, 
isobutyl, cyclopentyl, cyclopentylmethyl, cyclohexyl, cyclohexylmethyl, and the like. Also 
included within the definition of "alkyl" are fused and/or polycyclic aliphatic cyclic ring 
systems such as, for example, adamantane. As used herein the term "alkenyl" denotes an alkyl 
group having at least one carbon-carbon double bond. As used herein the term "alkynyl" 
denotes an alkyl group having at least one carbon-carbon triple bond. 

[0100] In some preferred embodiments, the alkyl, alkenyl, alkynyl, aryl, aryloxy, and 
heteroaryl substituent groups described above may bear one or more further substituent 
groups; that is, they may be "substituted". In some preferred embodiments these substituent 
groups can include halogens (for example fluorine, chlorine, bromine and iodine), CN, NO2, 
lower alkyl groups, aryl groups, heteroaryl groups, aralkyl groups, aralkyloxy groups, 
guanidino, alkoxycarbonyl, alkoxy, hydroxy, carboxy and amino groups. In addition, the alkyl 
and aryl portions of aralkyloxy, arylalkyl, arylsulfonyl, alkylsulfonyl, alkoxycarbonyl, and 
aryloxycarbonyl groups also can bear such substituent groups. Thus, by way of example only, 
substituted alkyl groups include, for example, fluoro-, chloro-, bromo- and 
iodoalkyl groups, aminoalkyl groups, and hydroxyalkyl groups, such as hydroxymethyl, 
hydroxyethyl, hydroxypropyl, hydroxybutyl, and the like. In some preferred embodiments 
such hydroxyalkyl groups contain from 1 to about 20 carbons. 

[0101] As used herein the term "aryl" means a group having 5 to about 20 carbon atoms 
and which contains at least one aromatic ring, such as phenyl, biphenyl and naphthyl. 
Preferred aryl groups include unsubstituted or substituted phenyl and naphthyl groups. The 
term "aryloxy" denotes an aryl group that is bound through an oxygen atom, for example a 
phenoxy group. 

[0102] In general, the prefix "hetero" denotes the presence of at least one hetero (i.e., 
non-carbon) atom, which is in some preferred embodiments independently one to three O, N, S, P, 
Si or metal atoms. Thus, the term "heteroaryl" denotes an aryl group in which one or more 
ring carbon atoms are replaced by such a heteroatom. Preferred heteroaryl groups include 
pyridyl, pyrimidyl, pyrrolyl, furyl, thienyl, and imidazolyl groups. 

[0103] The term "aralkyl" (or "arylalkyl") is intended to denote a group having from 6 to 
15 carbons, consisting of an alkyl group that bears an aryl group. Examples of aralkyl groups 
include benzyl, phenethyl, benzhydryl and naphthylmethyl groups. 

[0104] The term "alkylaryl" (or "alkaryl") is intended to denote a group having from 6 to 
15 carbons, consisting of an aryl group that bears an alkyl group. Examples of alkylaryl groups 
include methylphenyl, ethylphenyl and methylnaphthyl groups. 

[0105] The term "arylsulfonyl" denotes an aryl group attached through a sulfonyl group, 
for example phenylsulfonyl. The term "alkylsulfonyl" denotes an alkyl group attached 
through a sulfonyl group, for example methylsulfonyl. 

[0106] The term "alkoxycarbonyl" denotes a group of formula -C(=O)-O-R where R is 
alkyl, alkenyl, or alkynyl, where the alkyl, alkenyl, or alkynyl portions thereof can be 
optionally substituted as described herein. 

[0107] The term "aryloxycarbonyl" denotes a group of formula -C(=O)-O-R where R is 
aryl, where the aryl portion thereof can be optionally substituted as described herein. 

[0108] The terms "arylalkyloxy" or "aralkyloxy" are equivalent, and denote a group of 
formula -O-R'-R'', where R' is alkyl, alkenyl, or alkynyl which can be optionally 
substituted as described herein, and wherein R'' denotes an aryl or substituted aryl group. 

[0109] The terms "alkylaryloxy" or "alkaryloxy" are equivalent, and denote a group of 
formula -O-R'-R'', where R' is an aryl or substituted aryl group, and R'' is alkyl, alkenyl, or 
alkynyl which can be optionally substituted as described herein. 

[0110] As used herein, the term "aldehyde group" denotes a group that bears a moiety of 
formula -C(=O)-H. The term "ketone" denotes a moiety containing a group of formula 
-R-C(=O)-R', where R and R' are independently alkyl, alkenyl, alkynyl, aryl, heteroaryl, aralkyl, 
or alkaryl, each of which may be substituted as described herein. 

[0111] As used herein, the term "ester" denotes a moiety having a group of formula 
-R-C(=O)-O-R' or -R-O-C(=O)-R' where R and R' are independently alkyl, alkenyl, alkynyl, 
aryl, heteroaryl, aralkyl, or alkaryl, each of which may be substituted as described herein. 

[0112] The term "ether" denotes a moiety having a group of formula -R-O-R' where R 
and R' are independently alkyl, alkenyl, alkynyl, aryl, heteroaryl, aralkyl, or alkaryl, each of 
which may be substituted as described herein. 

[0113] The term "crown ether" has its usual meaning of a cyclic ether containing several 
oxygen atoms. As used herein the term "organosulfur compound" denotes aliphatic or 
aromatic sulfur containing compounds, for example thiols and disulfides. The term 
"organometallic group" denotes an organic molecule containing at least one metal atom. 

[0114] The term "organosilicon compound" denotes aliphatic or aromatic silicon 
containing compounds, for example alkyl and aryl silanes. 

[0115] The term "carboxylic acid" denotes a moiety having a carboxyl group, other than 
an amino acid. 

[0116] As used herein, the term "amino acid" denotes a molecule containing both an 
amino group and a carboxyl group. In some preferred embodiments, the amino acids are α-, β-, 
γ- or δ-amino acids, including their stereoisomers and racemates. As used herein the term 
"L-amino acid" denotes an α-amino acid having the L configuration around the α-carbon, that 
is, a carboxylic acid of general formula CH(COOH)(NH2)-(side chain), having the 
L-configuration. The term "D-amino acid" similarly denotes a carboxylic acid of general 
formula CH(COOH)(NH2)-(side chain), having the D-configuration around the α-carbon. Side 
chains of L-amino acids include naturally occurring and non-naturally occurring moieties. 
Non-naturally occurring (i.e., unnatural) amino acid side chains are moieties that are used in 
place of naturally occurring amino acid side chains in, for example, amino acid analogs. See, 
for example, Lehninger, Biochemistry, Second Edition, Worth Publishers, Inc, 1975, pages 72-77, 
incorporated herein by reference. Amino acid substituents may be attached through their 
carbonyl groups through the oxygen or carbonyl carbon thereof, or through their amino 
groups, or through functionalities residing on their side chain portions. 

[0117] As used herein "polynucleotide" refers to a nucleic acid molecule and includes 
genomic DNA, cDNA, RNA, mRNA and the like. 

[0118] As used herein "antisense oligonucleotide" refers to a nucleic acid molecule that is 
complementary to at least a portion of a target nucleotide sequence of interest and specifically 
hybridizes to the target nucleotide sequence under physiological conditions. 

[0119] For further information on the background of the invention the following 
references may be consulted, each of which, along with other references cited herein, is 
incorporated herein by reference in its entirety: 

References: 

(1) Baker, S.M. et al. (1995) "Male mice defective in the DNA mismatch repair gene PMS2 exhibit 
abnormal chromosome synapsis in meiosis" Cell 82:309-319. 

(2) Modrich, P. (1994) "Mismatch repair, genetic stability, and cancer" Science 
266:1959-1960. 

(3) Jiricny, J. and M. Nyström-Lahti (2000) "Mismatch repair defects in cancer" Curr. Opin. 
Genet. Dev. 10:157-161. 

(4) Prolla, T.A. et al. (1994) "MLH1, PMS1, and MSH2 interaction during the initiation of 
DNA mismatch repair in yeast" Science 264:1091-1093. 

(5) Strand, M. et al. (1993) "Destabilization of tracts of simple repetitive DNA in yeast by 
mutations affecting DNA mismatch repair" Nature 365:274-276. 

(6) Perucho, M. (1996) "Cancer of the microsatellite mutator phenotype" Biol. Chem. 
377:675-684. 

(7) Liu, T. et al. (2000) "Microsatellite instability as a predictor of a mutation in a DNA 
mismatch repair gene in familial colorectal cancer" Genes Chrom. Cancer 27:17-25. 

(8) Nicolaides, N.C., et al. (1995) "Genomic organization of the human PMS2 gene family" 
Genomics 30:195-206. 

[0120] The above disclosure generally describes the present invention. A more complete 
understanding can be obtained by reference to the following specific examples, which are 
provided herein for purposes of illustration only, and are not intended to limit the scope of the 
invention. 

EXAMPLES 

EXAMPLE 1: Stable expression of dominant negative mismatch repair (MMR) genes in 
cells results in MMR inactivity. 

[0121] Expression of a dominant negative allele in an otherwise mismatch repair (MMR) 
proficient cell can render these host cells MMR deficient (Nicolaides, N.C. et al. (1998) Mol. 
Cell Biol. 18:1635-1641, U.S. Patent No. 6,146,894 to Nicolaides et al.). The creation of 
MMR deficient cells can lead to the generation of genetic alterations throughout the entire 
genome of a host's offspring, yielding a population of genetically altered offspring or siblings 
that have an enhanced rate of homologous recombination. This patent application teaches 
the use of dominant negative MMR genes in cells, including but not limited to rodent, human, 
primate, yeast, insect, fish and prokaryotic cells, with enhanced rates of homologous 
recombination, followed by the introduction of locus specific targeting fragments (LSTFs) that 
can alter the expression of a chromosomal locus or integrate into a given exon of a gene for 
facilitated analysis of gene expression. 

[0122] To demonstrate the ability to create MMR defective mammalian cells with elevated 
rates of homologous recombination using dominant negative alleles of MMR genes, we first 
transfected an MMR proficient human cell line, human embryonic kidney cells (HEK293), with 
an expression vector containing the previously published human dominant negative PMS2 mutant, 
referred to herein as PMS134 (cell line referred to as 293PMS134), or with the vector containing 
no insert (cell line referred to as 293vec). A fragment containing the PMS134 cDNA was cloned 
into the pEF expression vector, which contains the constitutively active elongation factor 
promoter along with the neomycin resistance gene as selectable marker. The results showed 
that the PMS134 mutant could exert a robust dominant negative effect, resulting in 
biochemical and genetic manifestations of MMR deficiency. A brief description of the 
methods is provided below. 

[0123] A hallmark of MMR deficiency is the generation of unstable microsatellite repeats 
in the genome of host cells. This phenotype is referred to as microsatellite instability (MI). 
MI consists of deletions and/or insertions within repetitive mono-, di- and/or trinucleotide 
sequences throughout the entire genome of a host cell. Extensive genetic analyses of 
eukaryotic cells have found that the only biochemical defect that is capable of producing MI is 
defective MMR. In light of this unique feature that defective MMR has on promoting MI, it is 
now used as a biochemical marker to survey for lack of MMR activity within host cells. 

[0124] A method used to detect MMR deficiency in eukaryotic cells is to employ a 
reporter gene that has a polynucleotide repeat inserted within the coding region that disrupts 
its reading frame due to a frame shift. In the case where MMR is defective, the reporter gene 
will acquire random mutations (i.e., insertions and/or deletions) within the polynucleotide 
repeat, yielding clones that contain a functional reporter gene. An example of the ability to 
alter desired genes via defective MMR comes from experiments using HEK293 cells 
(described above), where a mammalian expression construct containing a defective 
β-galactosidase gene (referred to as pCAR-OF) was transfected into 293PMS134 or 293vec cells 
as described above. The pCAR-OF vector consists of a β-galactosidase gene containing a 
29-basepair poly-CA tract inserted at the 5' end of its coding region, which causes the wild-type 
reading frame to shift out-of-frame. This chimeric gene is cloned into the pCEP4 vector, which 
contains the constitutively active cytomegalovirus (CMV) promoter upstream of the cloning site and 
also contains the hygromycin-resistance (HYG) gene that allows for selection of cells 
containing this vector. The pCAR-OF reporter cannot generate β-galactosidase activity unless 
a frame-restoring mutation (i.e., insertion or deletion) arises following transfection into a host. 
Another reporter vector called pCAR-IF contains a β-galactosidase gene in which a 27-bp poly-CA 
repeat was cloned into the same site as in the pCAR-OF gene, but it is biologically active 
because the removal of a single repeat restores the open reading frame and produces a 
functional chimeric β-galactosidase polypeptide (not shown). In these proof-of-concept 
studies, 293PMS134 and 293vec cells were transfected with the pCAR-OF reporter vector and 
selected for 17 days in neomycin plus hygromycin selection medium. After the 17th day, 
resistant colonies were stained for β-galactosidase production to determine the number of 
clones containing a genetically altered β-galactosidase gene. All conditions produced a 
relatively equal number of neomycin/hygromycin resistant cells; however, only the cells 
expressing the PMS134 dominant negative allele (293PMS134) contained a subset of clones 
that were positive for β-galactosidase activity (Table 1). Table 1 shows the data from these 
experiments, where cell colonies were stained in situ for β-galactosidase activity and scored 
for activity. Cells were scored positive if the colonies turned blue in the presence of X-gal 
substrate and scored negative if colonies remained white. Analysis of triplicate experiments 
showed a significant increase in the number of β-galactosidase positive cells in the 
293PMS134 cultures, while no β-galactosidase positive cells were seen in the control 293vec cells. 
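
The reading-frame logic behind the two reporters can be checked with a few lines of Python. This is only a sketch under the assumption that the reporter is in frame whenever the inserted tract length is a multiple of three and that slippage adds or removes whole CA dinucleotides; the function names are hypothetical.

```python
def frame_offset(tract_length_bp):
    """Number of bases by which an inserted tract shifts the reading frame."""
    return tract_length_bp % 3


def frame_restoring_events(tract_length_bp, repeat_unit_bp=2, max_units=3):
    """Gains (+) or losses (-) of whole repeat units that restore the frame."""
    events = []
    for units in range(-max_units, max_units + 1):
        if units == 0:
            continue
        new_length = tract_length_bp + units * repeat_unit_bp
        if new_length > 0 and new_length % 3 == 0:
            events.append(units)
    return events


print(frame_offset(29))            # pCAR-OF tract: out of frame
print(frame_offset(27))            # pCAR-IF tract: in frame
print(frame_restoring_events(29))  # repeat-unit changes that restore the frame
```

Under these assumptions, losing one CA unit from the 29-bp tract gives the 27-bp in-frame tract of pCAR-IF, which is the kind of slippage event that unrepaired replication errors are expected to produce in MMR defective cells.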


Table 1. Number of 293PMS134 and 293vec cells containing a functional β-galactosidase gene 
as a result of MMR deficiency. 

Cells          White Colonies    Blue Colonies    % Clones with altered β-gal
293vec         95 ± 17           0                0/95 = 0%
293PMS134      88 ± 13           44 ± 8           44/132 = 33%

Table 1. β-galactosidase expression of 293vec and 293PMS134 cells transfected with pCAR-OF reporter 
vectors. Cells were transfected with the pCAR-OF β-galactosidase reporter plasmid. Transfected cells were 
selected in hygromycin and G418, expanded and stained with X-gal solution to measure β-galactosidase 
activity (blue colored cells). Three plates each were analyzed by microscopy. The results represent the 
mean ± standard deviation of these experiments. 
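
The percentages in the last column of Table 1 follow from the mean colony counts. The short Python sketch below reproduces that arithmetic; it uses only the mean values reported in the table (per-plate counts are not given), and the function name is hypothetical.

```python
def percent_altered(white_mean, blue_mean):
    """Percentage of colonies scored blue among all colonies counted."""
    total = white_mean + blue_mean
    if total == 0:
        return 0.0
    return 100.0 * blue_mean / total


# Mean colony counts taken from Table 1.
table_1 = {"293vec": (95, 0), "293PMS134": (88, 44)}

for cells, (white, blue) in table_1.items():
    print(cells, round(percent_altered(white, blue)))
```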

[0125] 293PMS134/pCAR-OF clones that were pooled and expanded also showed a 
number of cells that contained a functional β-galactosidase gene. No β-galactosidase positive 
cells were observed in 293vec cells transfected with the pCAR-OF vector (data not shown). 
These data demonstrate the ability of dominant negative alleles of MMR genes to suppress 
endogenous MMR activity. These cells are now primed for the introduction of locus specific 
targeting fragments for altering the expression or tagging the exon of specific genes within the 
chromosomal context of the host. 

In situ X-gal staining 

[0126] For in situ analysis, 100,000 cells are harvested and fixed in 1% glutaraldehyde, 
washed in phosphate buffered saline solution and incubated in 1 ml of X-gal substrate solution 
(0.15 M NaCl, 1 mM MgCl2, 3.3 mM K4Fe(CN)6, 3.3 mM K3Fe(CN)6, 0.2% X-Gal) in 
24-well plates for 2 hours at 37°C. Reactions are stopped in 500 mM sodium bicarbonate 
solution and transferred to microscope slides for analysis. Three plates each are counted for 
blue (β-galactosidase positive cells) or white (β-galactosidase negative cells) to assess for 
MMR inactivation. Table 1 shows the results from these studies. 



EXAMPLE 2: Generation of targeting cassettes for altered gene expression or tagged 
exons for expression profiling of host organisms. 

[0127] It has been previously reported that MMR defective cells have a higher rate of 
homologous recombination due to the decreased stringency for identical basepair matches of 
the target vector to the chromosomal locus. We observed an increased 
rate of homologous recombination of fragments containing very short regions of homology in 
MMR defective cells obtained from colorectal cancer patients, such as the HCT116 cell line 
(N. Nicolaides, personal observation), while MMR proficient cells, such as the wild type 
HEK293 cell line, showed undetectable integration of this type of fragment into a targeted 
locus. 

[0128] To address the ability to use LSTFs containing short areas of homology for rapid 
genome targeting of chromosomal loci, we employed MMR defective 293 cells 
(293PMS134) that express the PMS134 dominant negative allele as described in Example 1. 
We then employed an LSTF containing the Cytomegalovirus (CMV) promoter downstream 
of a constitutively expressed hygromycin cassette to monitor integration in the MMR defective 
line (see Figure 1). 

Generation of promoter locus-specific targeting fragments and cell lines. 

[0129] PCR products were amplified from the p4 plasmid, which contains a DNA insert 
with the Thymidine Kinase (Tk) promoter upstream of the hygromycin resistance (Hyg) gene 
followed by the SV40 polyadenylation signal and the cytomegalovirus (CMV) promoter. 
Plasmid was amplified with primers containing 3' sequences that are homologous to the 
plasmid vector sequence region upstream of the Tk promoter and downstream of the CMV 
promoter. Each primer also contained 70 nt that were homologous to the genomic locus of 
various target genes at the start site of transcription. PCRs were typically carried out using 
buffers as previously described (Grasso, L. et al. (1998) "Molecular analysis of human 
interleukin-9 receptor transcripts in peripheral blood mononuclear cells. Identification of a 
splice variant encoding for a nonfunctional cell surface receptor" J. Biol. Chem. 273:24016-24024). 
Amplification conditions consisted of one cycle of 95°C for 5 minutes, 30 cycles of 
94°C for 30 seconds/47°C for 30 seconds/72°C for 1 minute, and one cycle of 72°C for 2 
minutes. Primer pairs used for each gene are indicated in Table 2. LSTFs were analyzed by 
gel electrophoresis to confirm the expected molecular weight. Products were then purified by spin column to 
remove primers, salts and unincorporated dNTPs from fragments. 
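
The primer layout described above (a 5' homology arm followed by a 3' sequence that primes on the plasmid) can be sketched in Python as follows. The sequences are arbitrary placeholders and the function name is hypothetical; the 70 nt arm length is taken from the text.

```python
VALID_BASES = set("ACGT")


def build_lstf_primer(homology_arm, cassette_anchor, arm_length=70):
    """Join a genomic homology arm (5') to a plasmid-priming anchor (3')."""
    arm = homology_arm.upper()
    anchor = cassette_anchor.upper()
    if len(arm) != arm_length:
        raise ValueError("homology arm must be %d nt long" % arm_length)
    if not anchor or (set(arm) | set(anchor)) - VALID_BASES:
        raise ValueError("sequences must be non-empty and contain only A, C, G, T")
    return arm + anchor


# Placeholder sequences for illustration only (not taken from Table 2).
toy_arm = "ACGTTGCA" * 8 + "ACGTTG"   # 70 nt of locus homology
toy_anchor = "GATTACA" * 4 + "GA"     # 30 nt that anneal to the plasmid

primer = build_lstf_primer(toy_arm, toy_anchor)
print(len(primer))
```

Because the anchor sits at the 3' end, it is the part that is extended by the polymerase on the plasmid template, while the homology arm is carried along at the ends of the amplified fragment.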

[0130] The generation of stable cell lines with promoter locus-specific targeted knock-in 
fragments was performed as follows. Briefly, 1x10^ HEK293 (human embryonic kidney) cells 
stably expressing the PMS134 gene (see Example 1) were transfected with 1 µg of purified 
PCR products from above using 3 µl Fugene6 (Invitrogen), and stable transfectant pools were 
generated by co-selection with 100 µg/ml hygromycin B and G418 (neomycin). Cultures were 
selected for 14 days in neomycin and hygromycin. Pools and clones were analyzed for locus 
specific integration using reverse transcriptase coupled PCR as described (Nicolaides, N.C. et 
al. (1997) "Interleukin 9: a candidate gene for asthma" Proc. Natl. Acad. Sci. USA 94:13175-13180). 
Briefly, 1x10^ hygromycin/neomycin resistant cells transfected with various PCR 
fragments were lysed in 50 µl lysis buffer containing Tris-EDTA and NP40 and incubated for 10 
minutes on ice. Samples were added to oligo d(T) tubes in the presence of 50 µl binding buffer 
and incubated for 15 minutes at RT with shaking. Lysates were aspirated and washed 2x each with high 
salt wash buffer followed by low salt wash buffer. 33 µl of 1x First-strand cDNA mix 
containing NTPs and reverse transcriptase was added to tubes and incubated for 1 hr at 37°C. 
67 µl of a dH2O/Taq mixture was aliquoted into each sample along with appropriate 
gene-specific primers from Table 2. Amplification conditions consisted of one cycle of 95°C for 5 
minutes, 30 cycles of 94°C for 30 seconds/47°C for 30 seconds/72°C for 1 minute, and one 
cycle of 72°C for 2 minutes. 
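
As a quick check of the amplification program, the Python sketch below encodes the stated cycling conditions and sums the programmed hold times. The data layout is an assumption made for illustration, and instrument ramp times between temperatures are ignored.

```python
# Each entry: (number of cycles, [(temperature in °C, hold time in seconds), ...])
PROGRAM = [
    (1, [(95, 300)]),                      # initial denaturation, 5 minutes
    (30, [(94, 30), (47, 30), (72, 60)]),  # denature / anneal / extend
    (1, [(72, 120)]),                      # final extension, 2 minutes
]


def total_hold_seconds(program):
    """Sum of programmed hold times, ignoring ramping between steps."""
    return sum(cycles * sum(seconds for _, seconds in steps)
               for cycles, steps in program)


print(total_hold_seconds(PROGRAM) / 60, "minutes")
```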

[0131] Analysis of site-specific integration was carried out using four different previously 
studied loci that are expressed at undetectable levels in the HEK293 cell line under the growth 
conditions used in these studies. The target genes were the human N-Ras (a signal 
transduction gene), beta-globin (a structural protein), IFN-gamma (a secreted growth factor), 
and galanin receptor (a seven transmembrane G-coupled receptor). The primers used for each 
5' flanking locus are given below in Table 2, where the last 30 nts of each primer are specific for 
the 5' and 3' ends of the targeting fragment containing the Tk promoter driving hygromycin 
expression followed by the CMV promoter, while the 5' ends of each primer pair are specific 
to the 5' flanking region of each locus: N-RAS (SEQ ID NO:13 and 14); beta-globin (SEQ ID 
NO:15 and 16); Interferon gamma (SEQ ID NO:17 and 18); and galanin receptor (SEQ ID 
NO:19 and 20). Transfected cells were first analyzed by RT-PCR analysis to identify 
increased steady-state gene expression using primer pairs that were capable of detecting 
spliced mRNA (primers listed in Table 3). These primer combinations can detect the 
endogenous gene expression of a target gene independent of LSTF integration. Expression 
analysis of transfected cells failed to reveal robust expression levels of any of these four loci in 
parental HEK293 or control HEK293 cells transfected with the different fragments. 
Conversely, robust expression was observed for all targeted loci in transfected 293PMS134 
cells containing the appropriate LSTF. A representative example is shown using cells where 
the beta-globin locus was targeted. HEK293 cells, which are derived from embryonic kidney, 
have not been found to express the erythroid-specific beta-globin. Shown in Figure 2 is 
expression analysis of beta-globin using cDNA specific primers (SEQ ID NO:24 and SEQ ID 
NO:25, Table 3) in targeted cells containing the beta-globin LSTF, while none was observed 
in cells transfected with targeting vectors to other loci, which served as negative controls. An 
independent RT-PCR was carried out using cDNA from the positive cultures using a 5' primer 
that was located in the distal leader sequence of the CMV promoter (SEQ ID NO:21, Table 3) 
and a 3' primer located within the coding region of the beta-globin gene (SEQ ID NO:25, 
Table 3). This primer set is only capable of producing a product with the expected molecular 
weight if the LSTF is integrated within the specific targeted locus, because the resultant 
product is a hybrid transcript consisting of a CMV leader fused 
to the initiating start codon of the targeted gene, which can only arise by correct genome 
integration. Similar results were found using targeting 
fragments to other chromosomal loci as well as using primers containing 50 nts of flanking 
sequence, whereas no locus specific expression was observed in HEK293 control cells 
transfected with similar fragments (data not shown). 
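
The reasoning behind the two RT-PCR primer sets can be sketched with a small in silico check in Python. All sequences below are toy placeholders rather than the SEQ ID NO sequences, and the function names are hypothetical; the point is only that a CMV-leader forward primer yields a product on the hybrid transcript and none on the endogenous transcript.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def amplicon_length(cdna, forward, reverse):
    """Length of the product this primer pair would give on the cDNA, or None."""
    forward_at = cdna.find(forward)
    reverse_at = cdna.find(reverse_complement(reverse))
    if forward_at < 0 or reverse_at < 0 or reverse_at < forward_at:
        return None
    return reverse_at + len(reverse) - forward_at


# Toy placeholder sequences for illustration only.
cmv_leader = "GTCAGATCGCCTGGAGACGCCATCCACG"
native_utr = "ACATTTGCTTCTGACACAACTGTGTTCACTAGC"
coding = "ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCC"

endogenous = native_utr + coding   # transcript from the untargeted locus
hybrid = cmv_leader + coding       # transcript expected after LSTF integration

cmv_forward = cmv_leader[4:19]                   # primer in the CMV leader
gene_forward = coding[:18]                       # primer at the start codon
gene_reverse = reverse_complement(coding[-18:])  # primer in the coding region

for name, cdna in (("endogenous", endogenous), ("hybrid", hybrid)):
    print(name,
          amplicon_length(cdna, gene_forward, gene_reverse),
          amplicon_length(cdna, cmv_forward, gene_reverse))
```

The gene-specific pair reports expression regardless of how the transcript arose, whereas the CMV-leader pair reports only transcripts that start in the integrated cassette.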


Table 2. Transfection construct primers. 

Gene: N-Ras
  5' primer: NRAS-564674 (SEQ ID NO:13)
    TTCAGAGTAGAAAACTAAATATGAT GAATAACTAAAAATAATTTCTCAAA ATCCCCGTGGCCCGTTGCTCGCG
  3' primer: NRAS-567492R (SEQ ID NO:14)
    GCCCCAGTTGGACCCTG AGGTCGTACTCACCCCA ACAGCTCAGCGCCCCCT CTCCAGCGCCGCCATAA GCTACCCAGCTTCTAGA GATCTGACGGTTCAC

Gene: β-globin
  5' primer: HBB-59479 (SEQ ID NO:15)
    TGTGTGTGTGTTGTGGTCAGTGGGG CTGGAATAAAAGTAGAATAGACCTG CACCTGCTGTGGCATCCATTCTGCTT CATCCCCGTGGCCCGTTGCTCGCG
  3' primer: HBB-62206R (SEQ ID NO:16)
    TCAGGAGTCAGGTGCAC CATGGTGTCTGTTTGAGG TTGCTAGTGAACACAGT TGTGTCAGAAGCAAATG TTACCCAGCTTCTAGAG ATCTGACGGTTCAC

Gene: IFN-γ
  5' primer: IFNG-1626972 (SEQ ID NO:17)
    GTTCTCTGGACGTAATTTTTCTTGAG CAGAGCAACAGTAGAGCTTTGTATG CAACAATGTAATTTTTACACTGCTTC ATCCCCGTGGCCCGTTGCTCGCG
  3' primer: IFNG-1629791R (SEQ ID NO:18)
    ATCAGGTCCAAAGGACT TAACTGATCTTTCTCTTC TAATAGCTGATCTTCAG ATGATCAGAACAATGTG CTACCCAGCTTCTAGAG ATCTGACGGTTCAC

Gene: Galanin1 Receptor
  5' primer: Gal1R-283026F (SEQ ID NO:19)
    lUGCAGGAGCGGAAGCAAGAGAGG GAAGGGAGGAGGTGCCACACACTTT CAAACAACCAGATCTTCAGACCTGC TTCATCCCCGTGGCCCGTTGCTCGCG
  3' primer: Gal1R-280208R (SEQ ID NO:20)
    GCTCGGCTGAAATCCGC GCCCCTTAGAAGTCACG GTGCGCGAGCAGAGACT GGACGGATTCTAGCGGG ATTACCCAGCTTCTAGA GATCTGACGGTTCAC


Table 3. RT-PCR primers. 

5' primer                 5' primer sequence            3' primer                 3' primer sequence
(SEQ ID NO:21)            CAGATCTCTAGAAGCTGGGT
Nras (SEQ ID NO:22)       ATGACTGAGTACAAACTGGTGGTGG     Nras-R (SEQ ID NO:23)     CATTCGGTACTGGCGTATTTCTC
Globin (SEQ ID NO:24)     ATGGTGCACCTGACTCCTGAGGAG      Globin (SEQ ID NO:25)     GTTGGACTTAGGGAACAAAGGAAC
Galanin (SEQ ID NO:26)    AlUClWrnAGCATCTTCACCCTC       Galanin (SEQ ID NO:27)    CTGAAGAGGAAGGAAGCCGGCdTC
IFNg (SEQ ID NO:28)       ATGAAATATACAAGTTATATCTTGGC    IFNg (SEQ ID NO:29)       CAGGACAACCATTACTGGGATGC


[0132] Analysis of cell lines transfected with promoter-specific LSTFs can be carried out 
by any number of methods that measure levels of RNA or proteins. Such methods of analysis 
may include but are not limited to microarray analysis, in situ RT-PCR, Northern blot, western 
blotting, immunostaining, fluorescence activated cell sorting, etc. Cell lines over expressing a 
gene of interest may be analyzed by functional assays using biological systems that are 
sensitive to the production of certain biochemicals or growth factors. These methods are 
routinely used by those skilled in the art of high throughput screening and are useful for 
analyzing the expression levels of target genes in cells transfected with LSTFs. 

Generation of exon locus-specific targeting fragments and cell lines. 

[0133] The ability to target an exon of a specific gene in any given host organism enables 
the generation of exon specific tags to monitor gene expression profiles of a target gene upon 
exposure to biological factors and/or pharmaceutical compounds. This application teaches the 
use of inhibitors of MMR in somatic cells that can enhance the recombination of fragments 
with as little as 50 nts of homologous sequence to a chromosomal target within complex 
genomes, including those derived from human materials (see above). To take advantage of the 
ability to generate locus specific targets, we teach the use of exon locus specific targeting 
(LST) vectors that can be used to generate knock-ins within an exon of a specific locus, 
whereby the LST fragment contains a selectable marker fused to a reporter gene that can be 
used in combination with any number of analytical systems to monitor gene expression in situ 
or in vitro. An example of one such fusion cassette is presented in Figure 3, whereby the 
hygromycin resistance gene is fused in-frame with the luciferase gene. Using a similar 
strategy as described above, we generated a number of fusion expression cassettes that contain 
a selectable marker fused in-frame with a reporter gene. These vectors can consist of any 
selectable marker that can be used to select for stable transformants and any reporter gene that 
can be monitored to analyze expression levels of a particular locus or loci. 
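
What "fused in-frame" requires of such a cassette can be expressed as a simple check, sketched here in Python. The sketch assumes the marker open reading frame is supplied without its stop codon and the reporter with one; the sequences are toy placeholders and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame_fusion(marker_orf, reporter_orf):
    """True if marker + reporter read as one uninterrupted open reading frame."""
    fusion = (marker_orf + reporter_orf).upper()
    if len(marker_orf) % 3 != 0 or len(fusion) % 3 != 0:
        return False
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    if not codons or codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # No stop codon may occur before the final codon of the fusion.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


# Toy placeholder reading frames, not real marker or reporter sequences.
print(is_in_frame_fusion("ATGAAACCC", "GGGTTTTAA"))  # marker stop removed
print(is_in_frame_fusion("ATGAAATAG", "GGGTTTTAA"))  # marker stop retained
```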
[0134] Exon LSTFs are generated by PCR using 80-100 nt primers that contain 50-70 nts of 
5' sequence homologous to the 5' and 3' borders of a given gene's exon, while the 
terminal 30 nts are specific for the first and last codons of the fusion protein, such as those 
given as examples in Figure 3. PCR products are amplified from the pFusion plasmid, 
containing a DNA insert with the selectable marker/reporter gene. PCRs are carried out using 
buffers as previously described (Grasso, L. et al. (1998) "Molecular analysis of human 
interleukin-9 receptor transcripts in peripheral blood mononuclear cells. Identification of a 
splice variant encoding for a nonfunctional cell surface receptor" J. Biol. Chem. 
273:24016-24024). Amplification conditions consist of one cycle of 95°C for 5 minutes, 30 cycles of 94°C for 
30 seconds/47°C for 30 seconds/72°C for 1 minute, and one cycle of 72°C for 2 minutes. 
Primer pairs used for each exon LSTF are indicated in Table 4. LST fragments are analyzed 
by gel electrophoresis to ensure correct size. Reactions with correct size are then purified by 
spin column to remove primers from fragments. 
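
As an illustration only, the cycling conditions stated in paragraph [0134] can be written down as data and the programmed hold time totalled. The names Step, PROGRAM and total_hold_seconds are hypothetical and do not come from the application; ramp times between temperatures are ignored.

```python
# Illustrative sketch: the cycling program from paragraph [0134] expressed as data.
# Step, PROGRAM and total_hold_seconds are hypothetical names, not from the source.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time


# Each entry is (number of cycles, steps performed in each cycle).
PROGRAM = [
    (1, [Step(95, 5 * 60)]),                            # initial denaturation
    (30, [Step(94, 30), Step(47, 30), Step(72, 60)]),   # denature / anneal / extend
    (1, [Step(72, 2 * 60)]),                            # final extension
]


def total_hold_seconds(program):
    """Sum the programmed hold times; ramping between temperatures is not counted."""
    return sum(cycles * sum(step.seconds for step in steps)
               for cycles, steps in program)


if __name__ == "__main__":
    print(total_hold_seconds(PROGRAM) / 60, "minutes of programmed hold time")
```

The total covers hold times only, so the actual run time on a given instrument would be somewhat longer.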

[0135] Generation of stable cell lines with exon locus-specific targeted knock-in fragments 
is performed as follows. Briefly, 1x10^ MMR defective cells (stably expressing the PMS134 
gene; see Example 1) are transfected with 1 of purified PCR products from above using 3 
µl Fugene6 (Invitrogen), and stable transfectant pools are generated by co-selection with 100 
µg/ml hygromycin B and G418 (neomycin). Cultures are selected for 14 days in neomycin 
and hygromycin. Pools and clones are analyzed for locus specific integration using reverse 
transcriptase coupled PCR as described (Nicolaides, N.C. et al. (1997) "Interleukin 9: a 
candidate gene for asthma" Proc. Natl. Acad. Sci. USA 94:13175-13180). Briefly, 1x10^ 
hygromycin/neomycin resistant cells transfected with various PCR fragments are lysed in 50 
µl lysis buffer containing Tris-EDTA and NP40 and incubated 10 minutes on ice. Samples are 
added to oligo d(T) tubes in the presence of 50 µl binding buffer and incubated 15 minutes at RT with 
shaking. Lysates are aspirated and the tubes are washed 2x each with high salt wash buffer followed by low 
salt wash buffer. 33 µl of 1x First-strand cDNA mix containing NTPs and reverse transcriptase 
is added to the tubes and incubated 1 hr at 37°C. 67 µl of a dH2O/Taq mixture is aliquoted 
into each sample along with appropriate gene-specific primers that target sequences contained 
within the preceding exon and a 3' primer that targets sequence proximal to the fusion 
integration site. A schematic description of the exon LSTF and PCR analysis for integration 
is shown in Figure 4. 


Table 4: Primers for exon locus specific targeting fragments. N(50-70) indicates the sequence 
to be added to each primer for a specific exon. 

Fusion LSTF    5' primer                                                        3' primer 
Hyg-GFP        5'-N(50-70)-atgaaaaagcctgaactcaccgcgacgtct-3' (SEQ ID NO:30)     5'-N(50-70)-ccatgtgtgtg-3' (SEQ ID NO:31) 
Hyg-Luc        5'-N(50-70)-atgaaaaagcctgaactcaccgcgacgtct-3' (SEQ ID NO:32)     5'-N(50-70)-caatttggactttccgcccttcttggcctt-3' (SEQ ID NO:33) 
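
The primer layout of Table 4, 5'-N(50-70)-cassette tail-3', can be sketched as follows. This is a minimal illustration, not the applicants' procedure: the two tail constants are copied from the Hyg-Luc row (SEQ ID NO:32 and NO:33), while build_lstf_primer, reverse_complement and the placeholder homology arm are hypothetical. It is assumed that the homology arm for the 3' primer is supplied already in primer orientation, that is, as the reverse complement of the genomic top strand.

```python
# Sketch of the Table 4 primer layout: 5'-N(50-70)-<cassette tail>-3'.
# The tails are taken from the Hyg-Luc row of Table 4; everything else
# (function names, the placeholder arm) is hypothetical.

HYG_LUC_5P_TAIL = "atgaaaaagcctgaactcaccgcgacgtct"   # SEQ ID NO:32 tail
HYG_LUC_3P_TAIL = "caatttggactttccgcccttcttggcctt"   # SEQ ID NO:33 tail

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string, returned in lower case."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def build_lstf_primer(homology_arm: str, cassette_tail: str) -> str:
    """Join a 50-70 nt exon-border homology arm to a cassette-specific tail."""
    arm = homology_arm.lower()
    if set(arm) - set("acgt"):
        raise ValueError("homology arm must contain only a, c, g, t")
    if not 50 <= len(arm) <= 70:
        raise ValueError("homology arm must be 50-70 nt long")
    return arm + cassette_tail


if __name__ == "__main__":
    placeholder_arm = "acgt" * 15          # 60 nt stand-in, not a real exon border
    primer = build_lstf_primer(placeholder_arm, HYG_LUC_5P_TAIL)
    print(len(primer), primer)
```

With a 60 nt arm and a 30 nt tail the primer is 90 nt long, which falls inside the 80-100 nt range given in paragraph [0134].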


EXAMPLE 3: Generation of targeting cassettes for altered gene expression or tagged 
chromosomes for site-specific gene amplification. 
[0136] Another means for enhancing gene expression from the genome of a host organism 
is through the process of gene amplification. A number of studies have reported the use of 
expression vectors consisting of a gene of interest linked to a DHFR expression cassette. 
Once the expression vector has been inserted into the genome of a host cell line, expression 
cassettes can be amplified by selecting for clonal resistance to methotrexate, a process that 
occurs through gene amplification of the DHFR gene and surrounding proximal and distal loci 
(Ma, C. et al. (1993) "Sister chromatid fusion initiates amplification of the dihydrofolate 
reductase gene in Chinese hamster cells" Genes Dev. 7:605-620). A method is taught here that 
employs LSTFs in cells made MMR defective via the use of MMR inhibitors, whereby the 
LSTF contains a constitutively expressed DHFR gene juxtaposed to selectable markers, with 
the ends of the LSTF containing 50-70 bps of sequence homologous to an endogenous gene 
locus. The target site may be proximal, intragenic or distal to the target locus. Briefly, the 
LSTF is generated from a Hyg-DHFR cassette via PCR using the pHYG-DHFR vector as 
template. Amplifications are generated using a primer that is 5' to the TK promoter, which 
controls HYG expression, and a primer that is directed to the sequence 3' of the DHFR 
gene, which consists of the SV40 polyA signal. Each primer contains 50-70 nts that are 
homologous to the chromosomal target site. Cells are transfected with a dominant negative 
MMR expression vector, which contains a neomycin resistance marker as described in 
Example 1, along with the LSTF. Upon cotransfection, cells are coselected in hygromycin and 
neomycin for 14 days. Cells are analyzed for chromosomal specific integration using primers 
that flank the targeted site of integration. Analysis can be in pooled cultures or in single 
clones. Upon confirmation of integration, cells are selected for chromosomal site-specific 
amplification by methotrexate (MTX) selection. Briefly, 1.0 x 10* cells are seeded in 10 cm 
culture dishes with complete growth medium supplemented with 10% dialyzed fetal bovine 
serum 24 h prior to drug selection. Next, MTX is added at 15 times the calculated IC50 and the 
plates are incubated at 37°C. Cells are grown in the presence of continuous MTX selection for 
14 to 21 days. Colonies are selected and analyzed for DHFR and chromosome amplification. 
Analysis of genomic DNA is carried out using the modified salting out method. Briefly, cells 
are isolated from parental or MTX exposed clones. Cells are pelleted and lysed in 1 ml of lysis 
buffer (25 mM Tris-HCl pH 8.0, 25 mM EDTA, 1% SDS, 0.5 mg/ml proteinase K). Cell 
lysates are incubated at 50°C for 12 hrs to overnight. Following ethanol precipitation and 
resuspension, RNaseA is added to 100 µg/ml and the mixture is kept at 37°C for 30 min. 
Next, DNAs are phenol extracted and precipitated by the addition of 3 M NaOAc and ethanol. 
DNA pellets are washed once with 70% ethanol, air-dried and resuspended in TE buffer. 
DNAs are digested with different restriction enzymes and probed for DHFR and the locus of 
interest for amplification as compared to the control cells. MMR activity is restored in 
amplified clones and the cells are used for experimentation or production. 
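
The two pieces of bench arithmetic in paragraph [0136], the MTX dose at 15 times the IC50 and the make-up of 1 ml of lysis buffer, can be sketched as below. This is an illustration under stated assumptions: the stock concentrations (1 M Tris-HCl, 0.5 M EDTA, 10% SDS, 20 mg/ml proteinase K) and the example IC50 are assumed values chosen for the example, and the function names are hypothetical; only the final concentrations and the 15-fold factor come from the text.

```python
# Illustrative bench arithmetic for paragraph [0136].
# Final concentrations and the 15x factor are from the text; all stock
# concentrations and the example IC50 are assumptions made for this sketch.


def mtx_selection_conc(ic50: float, fold: float = 15.0) -> float:
    """MTX selection concentration, in the same units as ic50."""
    return fold * ic50


def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """Stock volume from C1*V1 = C2*V2; both concentrations share one unit."""
    return final_conc * final_volume_ul / stock_conc


def lysis_buffer_recipe(final_volume_ul: float = 1000.0) -> dict:
    """Volumes (in microliters) of each assumed stock for the lysis buffer."""
    # name: (final concentration, assumed stock concentration)
    components = {
        "Tris-HCl pH 8.0 (mM)": (25.0, 1000.0),
        "EDTA (mM)": (25.0, 500.0),
        "SDS (%)": (1.0, 10.0),
        "proteinase K (mg/ml)": (0.5, 20.0),
    }
    recipe = {name: stock_volume_ul(final, stock, final_volume_ul)
              for name, (final, stock) in components.items()}
    recipe["water"] = final_volume_ul - sum(recipe.values())
    return recipe


if __name__ == "__main__":
    print("MTX for an assumed IC50 of 20 nM:", mtx_selection_conc(20.0), "nM")
    for name, volume in lysis_buffer_recipe().items():
        print(f"{name}: {volume:.0f} ul")
```

Different stock concentrations change only the entries in the components table; the dilution rule itself stays the same.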
[0137] A benefit taught by this application is the combined use of MMR deficiency and 
enhanced homologous recombination with LSTFs, and the ability to produce site-specific gene 
amplification within a host's genomic locus. Recently, a report by Lin, C.T. et al. ((2001) 
"Suppression of gene amplification and chromosomal DNA integration by the DNA mismatch 
repair system" Nucl. Acids Res. 29:3304-3310) found that the lack of MMR results in increased 
gene amplification using a reporter gene system. The approach taught here describes a method 
that allows for enhanced locus amplification at a specific chromosomal site within a host's 
genome. 

Discussion 

[0138] The results and observations described here lead to several conclusions. First, 
expression of PMS134 results in an increase in microsatellite instability in HEK293 cells through 
the dominant negative blockade of mismatch repair. Second, the inhibition of MMR in 
somatic cells can lead to increased rates of homologous recombination between short 
nucleotide sequences 50-70 nts in length. Finally, blocking MMR with 
dominant negative inhibitors such as polypeptides, or with chemical inhibitors, provides a rapid 
process that can be used to genetically engineer somatic mammalian cells to alter the 
expression of a particular locus at the chromosomal level, as well as to tag exons of genes 
so that the expression of a chromosomal locus can be monitored in response to exposure to biochemicals 
and pharmaceutical compounds. 

[0139] While previous reports have taught that inhibiting MMR can lead to 
increased homologous recombination with divergent sequences, this application teaches the 
use of MMR deficient somatic cell lines along with targeting fragments containing 
50-70 nts of homology to a gene locus to alter and/or monitor its expression. 
[0140] The blockade of MMR in cells to increase LSTF integration can be through the use 
of dominant negative MMR gene alleles from any species including bacteria, yeast, protozoa, 
insects, rodents, primates, mammalian cells, and man. Blockade of MMR can also be 
generated through the use of antisense RNA or deoxynucleotides directed to any of the genes 
involved in the MMR biochemical pathway. Blockade of MMR can be through the use of 
polypeptides that interfere with subunits of the MMR complex including but not limited to 
antibodies. Finally, the blockade of MMR may be through the use of chemicals such as but 
not limited to nonhydrolyzable ATP analogs, which have been shown to block MMR (Galio, L. 
et al. (1999) "ATP hydrolysis-dependent formation of a dynamic ternary nucleoprotein 
complex with MutS and MutL" Nucl. Acids Res. 27:2325-2331; Spampinato, C. and P. 
Modrich (2000) "The MutL ATPase is required for mismatch repair" J. Biol. Chem. 
275:9863-9869). 
What is claimed is: 

1. A method of introducing a locus specific targeting fragment into the genome of a cell 
through homologous recombination comprising: 

inhibiting endogenous mismatch repair of said cell; 

introducing a locus specific targeting fragment into said cell; 
wherein said locus specific targeting fragment is a polynucleotide comprising at least one 
promoter, a selectable marker and 5' and 3' flanking regions of about 20 to about 120 
nucleotides; wherein said 5' and 3' flanking regions are homologous to a selected portion of 
the genome of said cell; and wherein said locus specific targeting fragment integrates into the 
genome of said cell by homologous recombination. 

2. The method of claim 1, further comprising restoring mismatch repair activity of 
said cell. 

3. The method of claim 1, wherein said promoter is selected from the group 
consisting of a CMV promoter, an SV40 promoter, elongation factor, LTR sequence, a pIND 
promoter sequence, a tetracycline promoter sequence, and a MMTV promoter sequence. 

4. The method of claim 1, wherein said selectable marker is selected from the group 
consisting of a hygromycin resistance gene, a neomycin resistance gene and a zeocin 
resistance gene. 

5. The method of claim 1, wherein said 5' and 3' flanking regions are about 30 to 
about 100 nucleotides in length. 

6. The method of claim 1, wherein said 5' and 3' flanking regions are about 40 to 
about 90 nucleotides in length. 

7. The method of claim 1, wherein said 5' and 3' flanking regions are about 50 to 
about 80 nucleotides in length. 

8. The method of claim 1, wherein said 5' and 3' flanking regions are about 50 to 
about 70 nucleotides in length. 

9. The method of claim 1, wherein said cell is selected from the group consisting of a 
vertebrate cell, an invertebrate cell, a mammalian cell, a reptilian cell, a fungal cell, and a 
yeast cell. 

10. The method of claim 1, wherein said 5' and 3' flanking regions are homologous to a 
5' flanking region of a selected chromosomal locus of said cell. 

11. The method of claim 1 wherein said mismatch repair is inhibited by introducing 
into said cell a dominant negative allele of a mismatch repair gene. 

12. The method of claim 11 wherein said mismatch repair gene is selected from the 
group consisting of PMS2, PMS1, MSH2, MSH6, and MLH1. 

13. The method of claim 11 wherein said mismatch repair gene is a PMS2 gene. 

14. The method of claim 13 wherein said PMS2 gene is selected from the group 
consisting of a PMS2-134 gene, a PMSR2 gene, and a PMSR3 gene. 

15. The method of claim 1 wherein mismatch repair is inhibited using a chemical 
inhibitor of mismatch repair. 

16. A method of genetically altering a cell to overproduce a selected polypeptide 
comprising: 

inhibiting endogenous mismatch repair of said cell; 

introducing a locus specific targeting fragment into said cell; wherein said locus 
specific targeting fragment is a polynucleotide comprising at least one promoter sequence, a 
selectable marker and 5' and 3' flanking regions of about 20 to about 120 nucleotides, wherein 
said 5' and 3' flanking regions are homologous to a selected portion of the genome of said 
cell, and wherein said locus specific targeting fragment integrates into the genome of said cell 
by homologous recombination; and 

selecting said cell that overproduces said selected polypeptide. 

17. The method of claim 16, further comprising restoring mismatch repair activity of 
said cell. 

18. The method of claim 16, wherein said promoter is selected from the group 
consisting of a CMV promoter, an SV40 promoter, elongation factor, LTR sequence, a pIND 
promoter sequence, a tetracycline promoter sequence, and a MMTV promoter sequence. 

19. The method of claim 16, wherein said selectable marker is selected from the group 
consisting of a hygromycin resistance gene, a neomycin resistance gene and a zeocin 
resistance gene. 

20. The method of claim 16, wherein said 5' and 3' flanking regions are about 30 to 
about 100 nucleotides in length. 

21. The method of claim 16, wherein said 5' and 3' flanking regions are about 40 to 
about 90 nucleotides in length. 

22. The method of claim 16, wherein said 5' and 3' flanking regions are about 50 to 
about 80 nucleotides in length. 

23. The method of claim 16, wherein said 5' and 3' flanking regions are 50 to 70 
nucleotides in length. 

24. The method of claim 16, wherein said cell is selected from the group consisting of 
a vertebrate cell, an invertebrate cell, a mammalian cell, a reptilian cell, a fungal cell, and a 
yeast cell. 

25. The method of claim 16, wherein said 5' and 3' flanking regions are homologous to 
the 5' flanking region of a selected chromosomal locus of said cell. 

26. The method of claim 16 wherein said mismatch repair is inhibited by administering 
to said cell a polynucleotide comprising a dominant negative mismatch repair gene. 

27. The method of claim 16 wherein said mismatch repair gene is selected from the 
group consisting of PMS2, PMS1, MSH2, MSH6, and MLH1. 

28. The method of claim 26 wherein said mismatch repair gene is a PMS2 gene. 

29. The method of claim 28 wherein said PMS2 gene is selected from the group 
consisting of a PMS2-134 gene, a PMSR2 gene, and a PMSR3 gene. 

30. The method of claim 16 wherein mismatch repair is inhibited using a chemical 
inhibitor of mismatch repair. 

31. A method of tagging an exon of a cell for screening gene expression in response to 
biochemical or pharmaceutical compounds comprising: 

inhibiting endogenous mismatch repair of said cell; and 
introducing a locus specific targeting fragment into said cell; 
wherein said locus specific targeting fragment is a polynucleotide comprising a reporter 
element, a selectable marker and 5' and 3' flanking regions of about 20 to about 120 
nucleotides, wherein said 5' and 3' flanking regions are homologous to a selected portion of 
the genome of said cell; wherein said locus specific targeting fragment integrates within a 
targeted gene's exon by homologous recombination; and wherein said cells containing genes 
with tagged exons are used for screening gene expression in response to biochemical or 
pharmaceutical compounds. 

32. The method of claim 31, further comprising restoring mismatch repair activity of 
said cell. 

33. The method of claim 31, wherein said reporter element is selected from the group 
consisting of luciferase and green fluorescent protein. 

34. The method of claim 31, wherein said selectable marker is selected from the group 
consisting of a hygromycin resistance gene, a neomycin resistance gene, and a zeocin 
resistance gene. 

35. The method of claim 31, wherein said reporter element is fused in frame to said 
selectable marker. 

36. The method of claim 31, wherein said 5' and 3' flanking regions are about 30 to 
about 100 nucleotides in length. 

37. The method of claim 31, wherein said 5' and 3' flanking regions are about 40 to 
about 90 nucleotides in length. 

38. The method of claim 31, wherein said 5' and 3' flanking regions are about 50 to 
about 80 nucleotides in length. 

39. The method of claim 31, wherein said 5' and 3' flanking regions are 50 to about 70 
nucleotides in length. 

40. The method of claim 31, wherein said cell is selected from the group consisting of 
a vertebrate cell, an invertebrate cell, a mammalian cell, a reptilian cell, a fungal cell, and a 
yeast cell. 

41. The method of claim 31, wherein said 5' and 3' flanking regions are homologous to 
the 5' flanking region of a selected chromosomal locus of said cell. 

42. The method of claim 31 wherein said mismatch repair is inhibited by administering 
to said cell a polynucleotide comprising a dominant negative mismatch repair gene. 

43. The method of claim 31 wherein said mismatch repair gene is selected from the 
group consisting of PMS2, PMS1, MSH2, MSH6, and MLH1. 

44. The method of claim 42 wherein said mismatch repair gene is a PMS2 gene. 

45. The method of claim 44 wherein said PMS2 gene is selected from the group 
consisting of a PMS2-134 gene, a PMSR2 gene, and a PMSR3 gene. 

46. The method of claim 31 wherein mismatch repair is inhibited using a chemical 
inhibitor of mismatch repair. 

47. A method of tagging a specific chromosomal site for locus-specific gene 
amplification comprising: 

inhibiting endogenous mismatch repair of said cell; and 
introducing a locus specific targeting fragment into said cell; 
wherein said locus specific targeting fragment is a polynucleotide comprising, operatively 
linked: a dihydrofolate reductase gene, a promoter, and 5' and 3' flanking regions of about 20 
to about 120 nucleotides, wherein said 5' and 3' flanking regions are homologous to a selected 
portion of the genome of said cell; wherein said locus specific targeting fragment integrates 
into the genome of said cell by homologous recombination; and wherein said specific 
chromosomal site is tagged for locus specific gene amplification. 

48. The method of claim 47, further comprising restoring mismatch repair activity of 
said cell. 

49. The method of claim 47 wherein said locus specific targeting fragment further 
comprises a selectable marker and a second promoter operatively linked to said selectable 
marker. 

50. The method of claim 47, wherein said promoter is selected from the group 
consisting of a CMV promoter, an SV40 promoter, elongation factor, LTR sequence, a pIND 
promoter sequence, a tetracycline promoter sequence, and a MMTV promoter sequence. 

51. The method of claim 47, wherein said selectable marker is selected from the group 
consisting of a hygromycin resistance gene, a neomycin resistance gene, and a zeocin 
resistance gene. 

52. The method of claim 47, wherein said 5' and 3' flanking regions are about 30 to 
about 100 nucleotides in length. 

53. The method of claim 47, wherein said 5' and 3' flanking regions are about 40 to 
about 90 nucleotides in length. 

54. The method of claim 47, wherein said 5' and 3' flanking regions are about 50 to 
about 80 nucleotides in length. 

55. The method of claim 47, wherein said 5' and 3' flanking regions are 50 to about 70 
nucleotides in length. 

56. The method of claim 47, wherein said cell is selected from the group consisting of 
a vertebrate cell, an invertebrate cell, a mammalian cell, a reptilian cell, a fungal cell, and a 
yeast cell. 

57. The method of claim 47, wherein said 5' and 3' flanking regions are homologous to 
the chromosomal region of a target gene. 

58. The method of claim 47 wherein said mismatch repair is inhibited by administering 
to said cell a polynucleotide comprising a dominant negative mismatch repair gene. 

59. The method of claim 47 wherein said mismatch repair gene is selected from the 
group consisting of PMS2, PMS1, MSH2, MSH6, and MLH1. 

60. The method of claim 47 wherein said mismatch repair gene is a PMS2 gene. 

61. The method of claim 60 wherein said PMS2 gene is selected from the group 
consisting of a PMS2-134 gene, a PMSR2 gene, and a PMSR3 gene. 

62. The method of claim 47 wherein mismatch repair is inhibited using a chemical 
inhibitor of mismatch repair. 

63. A locus specific targeting fragment comprising: a dihydrofolate reductase gene 
operatively linked to a promoter, and 5' and 3' flanking regions of about 20 to about 120 
nucleotides wherein said 5' and 3' flanking sequences are homologous to a selected portion of 
a genome of a cell. 

64. The locus specific targeting fragment of claim 63 further comprising a selectable 
marker operatively linked to a second promoter sequence. 

65. The locus specific targeting fragment of claim 64 further comprising an IRES 
sequence between said dihydrofolate reductase gene and said selectable marker. 

66. The locus specific targeting fragment of claim 63, wherein said 5' and 3' flanking 
regions are about 30 to about 100 nucleotides in length. 

67. The locus specific targeting fragment of claim 63, wherein said 5' and 3' flanking 
regions are about 40 to about 90 nucleotides in length. 

68. The locus specific targeting fragment of claim 63, wherein said 5' and 3' flanking 
regions are about 50 to about 80 nucleotides in length. 

69. The locus specific targeting fragment of claim 63, wherein said 5' and 3' flanking 
regions are 50 to about 70 nucleotides in length. 

70. A locus specific targeting fragment comprising: a reporter element, a selectable 
marker operatively linked to a promoter, and 5' and 3' flanking regions of about 20 to about 
120 nucleotides. 

71. The locus specific targeting fragment of claim 70, wherein said 5' and 3' flanking 
regions are about 30 to about 100 nucleotides in length. 

72. The locus specific targeting fragment of claim 70, wherein said 5' and 3' flanking 
regions are about 40 to about 90 nucleotides in length. 

73. The locus specific targeting fragment of claim 70, wherein said 5' and 3' flanking 
regions are about 50 to about 80 nucleotides in length. 

74. The locus specific targeting fragment of claim 70, wherein said 5' and 3' flanking 
regions are 50 to about 70 nucleotides in length. 

75. A locus specific targeting fragment comprising: at least one promoter sequence, a 
selectable marker and 5' and 3' flanking regions of about 20 to about 120 nucleotides. 

76. The locus specific targeting fragment of claim 75, wherein said 5' and 3' flanking 
regions are about 30 to about 100 nucleotides in length. 

77. The locus specific targeting fragment of claim 75, wherein said 5' and 3' flanking 
regions are about 40 to 90 nucleotides in length. 

78. The locus specific targeting fragment of claim 75, wherein said 5' and 3' flanking 
regions are about 50 to about 80 nucleotides in length. 

79. The locus specific targeting fragment of claim 75, wherein said 5' and 3' flanking 
regions are about 50 to about 70 nucleotides in length. 

80. A method of producing a locus specific targeting fragment comprising amplifying a 
nucleic acid construct comprising a promoter and a selectable marker with a 5' and 3' primer 
in a polymerase chain reaction, wherein said 5' primer comprises about 20 to about 120 
nucleotides that are homologous to a portion of the genome of a cell positioned 5' of a target 
locus, and wherein said 3' primer comprises about 20 to about 120 nucleotides that are 
homologous to a portion of the genome of a cell positioned 3' of said target locus. 

81. The method of claim 80 wherein said nucleic acid construct further comprises a 
second protein encoding sequence operatively linked to a second promoter. 

82. The method of claim 80 wherein said second protein encoding sequence is a 
dihydrofolate reductase sequence. 

83. A method of introducing a locus specific targeting fragment into the genome of a 
cell through homologous recombination comprising: introducing a locus specific targeting 
fragment into a mismatch repair-deficient cell; wherein said locus specific targeting fragment 
is a polynucleotide comprising a nucleic acid sequence to be incorporated into the genome of 
said mismatch repair deficient cell; wherein said polynucleotide comprises portions of about 
20 to about 120 nucleotides, each flanking the 5' and 3' portion of said nucleic acid sequence 
to be incorporated into said genome; wherein said 5' and 3' flanking regions are homologous 
to a selected portion of the genome of said cell; and wherein said locus specific targeting 
fragment integrates into the genome of said mismatch repair deficient cell by homologous 
recombination. 

84. The method of claim 83 further comprising the step of selecting said cells based on 
resistance to methotrexate. 

85. The method of claim 83 wherein said locus specific targeting fragment further 
comprises an operatively positioned locus control region. 
[Fig. 2: schematic drawing; recoverable labels: TK promoter, SV40 polyA] 
A: Hygromycin-green fluorescent fusion protein (Hyg-GFP) (SEQ ID NO: 46) 

1 mickpcltata vBkfllakfd ■VBdlmqlse gaearafafd vggrgyvlrv naeadgCykd 
61 ryvyrhfana alpipavXdl gefaaalbye iarraqgvtl qdlpatalpa vlqpvaaaaid 
121 aiaaadlaqt agfgpfgpqg igqyttwrdf leaiadphvy hwqtvnddtv aaavaqalda 
181 lalwaedepa vrU-viiadfg annvlbdngr Itavidwaaa nfgdaqyava aiefwrpwla 
241 emeqqtry-fe rrhpalagap rlraymlrlg Idqlyqalvd gafddaawaq grcdalvrag 
301 agtvgrtqia rraaawtdg cvavladsgn rrpatrpdra mgaanmskge elftgwpil 
361 veldgdvhgh kfsvrgegeg dadygkleik f icttgklpv pwptlvttlg ygilcf aryp 
421 ehmkmndffk sampegyiqe rtiffqddgk yktrgevkfe gdtlvnriel kgmdfkedgn 
481 ilghkleynf nahnvyimpd kannglkvnf kirhnieggg vgladhyqta vplgdgpvli 
541 piahylstqt aiskdmetr dhmvfleffs acghthgmde lyk 

Fig. 3A 
B: Hygromycin-luciferase fusion protein (Hyg-Luc) (SEQ ID NO: 47) 

1 mkkpeltats velcfliakfd •vadlinqlae g«esraC«£d vggrgyvlrv nacadgfykd 
61 ryvyrhfasa alpip«vldi gafsaaltyc Isrraqgvtl qdlpatalpa vlqpvaaMnd 
121 aiaaadlaqt sgfgpfgpqg igqyttwrdt icaiadphvy hwqtvmddtv aaavaqalde 
181 Imlwaadcp* vrhlvhadfg snnvltdngr itavidwaaa mfgdaqyava nlffwrpwla 
241 cmeqqtryfa crhpalagap rlraymlrlg Idqlyqalvd gnfddaawaq gredalvrag 
301 agtvgrtqla rraaavwtdg evevladsgn rxpatxpdr* ngaanmedak nikkgpapfy 
361 pledgtageq Ihkamkryal vpgtiaftda hievnltyae yfemsvrlae amkryglntn 
421 hriwcsena Iqffmpvlga Ifigvavapa ndiynerell nsmnisqptv vfvskkglqk 
481 ilnvqkklpi iqkiiimdsk tdyqgfqsmy tfvtshlpi)g fneydfvpes fdrdktiali 
541 nmssgstglp kgvalphrta cvrfshardp ifgnqiipdt ailswpfhh gfginfttlgy 
601 licgfrwlm yrfeeelflr slqdykiqsa llvptlfsff akstlidkyd Isnlheiasg 
661 gapiskevge avakrfhlpg irqgygltet tsailitpeg ddkpgavgkv vpffeakwd 
721 Idtgktlgvn qrgelcvrgp mimsgyvimp eatnalidkd gwlhsgdiay wdedehff iv 
781 drlkslikyk gyqvapaele aillqhpnif dagvaglpdd dagelpaaw vlehgktmte 
841 keivdyvasq vttakklrgg wfvdevpkg Itgkldarki reilikakkg gkskl 


Fig. 3B 
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□ 


GeneX 


1 2 3 4 

region J I B B 


Primer set A 


Fig. 4 


WO 03/062435 PCT/US03/01361 
SEQUENCE LISTING 

<110> Morphotek Inc. 
Grasso, Luigi 
Kline, J. Bradford 
Nicolaides, Nicholas C. 
Sass, Philip M. 

<120> Method for Generating Engineered Cells for Locus Specific Gene 
Regulation and Analysis 

<130> MG0003 PCT (MOR-0141) 


<160> 47 

<170> PatentIn version 3.2 

<210> 1 

<211> 859 

<212> PRT 

<213> Mus musculus 

<400> 1 

Met Glu Gln Thr Glu Gly Val Ser Thr Glu Cys Ala Lys Ala Ile Lys 
1 5 10 15 

Pro Ile Asp Gly Lys Ser Val His Gln Ile Cys Ser Gly Gln Val Ile 

Leu Ser Leu Ser Thr Ala Val Lys Glu Leu Ile Glu Asn Ser Val Asp 
35 40 

Ala Gly Ala Thr Thr Ile Asp Leu Arg Leu Lys Asp Tyr Gly Val Asp 
50 55 60 

Leu Ile Glu Val Ser Asp Asn Gly Cys Gly Val Glu Glu Glu Asn Phe 
65 80 

Glu Gly Leu Ala Leu Lys His His Thr Ser Lys Ile Gln Glu Phe Ala 
85 90 

Asp Leu Thr Gln Val Glu Thr Phe Gly Phe Arg Gly Glu Ala Leu Ser 
100 110 

Ser Leu Cys Ala Leu Ser Asp Val Thr Ile Ser Thr Cys His Gly Ser 
115 120 125 

Ala Ser Val Gly Thr Arg Leu Val Phe Asp His Asn Gly Lys Ile Thr 
130 135 140 

Gln Lys Thr Pro Tyr Pro Arg Pro Lys Gly Thr Thr Val Ser Val Gln 
145 150 155 

His Leu Phe Tyr Thr Leu Pro Val Arg Tyr Lys Glu Phe Gln Arg Asn 
Leu Ser Thr Ser Gly Arg His Lys Thr Phe Ser Thr Phe Arg Ala Ser 
Tyr Arg Gly Leu Arg Gly Ser Gln Asp Lys Leu Val Ser Pro Thr Asp 


Leu Ser Ser Thr Ser Ala Gly Ser Glu Glu Glu Phe Ser Thr Pro Glu 
500 505 510 

Val Ala Ser Ser Phe Ser Ser Asp Tyr Asn Val Ser Ser Leu Glu Asp 
515 520 525 

Arg Pro Ser Gln Glu Thr Ile Asn Cys Gly Asp Leu Asp Cys Arg Pro 
530 535 540 

Pro Gly Thr Gly Gln Ser Leu Lys Pro Glu Asp His Gly Tyr Gln Cys 


Lys Ala Leu Pro Leu Ala Arg Leu Ser Pro Thr Asn Ala Lys Arg Phe 
565 570 575 

Lys Thr Glu Glu Arg Pro Ser Asn Val Asn Ile Ser Gln Arg Leu Pro 


595 

Asn Lys Arg Ile Val Leu Leu Glu Phe Ser Leu Ser Ser Leu Ala Lys 
610 615 620 

Arg Met Lys Gln Leu Gln His Leu Lys Ala Gln Asn Lys His Glu Leu 
625 630 635 

Ser Tyr Arg Lys Phe Arg Ala Lys Ile Cys Pro Gly Glu Asn Gln Ala 


Ala Glu Asp Glu Leu Arg Lys Glu Ile Ser Lys Ser Met Phe Ala Glu 
660 665 670 

Met Glu Ile Leu Gly Gln Phe Asn Leu Gly Phe Ile Val Thr Lys Leu 
675 680 685 

Lys Glu Asp Leu Phe Leu Val Asp Gln His Ala Ala Asp Glu Lys Tyr 


Asn Phe Glu Met Leu Gln Gln His Thr Val Leu Gln Ala Gln Arg Leu 
705 710 715 

Ile Thr Pro Gln Thr Leu Asn Leu Thr Ala Val Asn Glu Ala Val Leu 


Ile Glu Asn Leu Glu Ile Phe Arg Lys Asn Gly Phe Asp Phe Val Ile 
740 745 750 
Asp Glu Asp Ala Pro Val Thr Glu Arg Ala Lys Leu Ile Ser Leu Pro 
755 760 765 

Thr Ser Lys Asn Trp Thr Phe Gly Pro Gln Asp Ile Asp Glu Leu Ile 
770 775 780 

Phe Met Leu Ser Asp Ser Pro Gly Val Met Cys Arg Pro Ser Arg Val 
785 790 795 800 

Arg Gln Met Phe Ala Ser Arg Ala Cys Arg Lys Ser Val Met Ile Gly 
805 810 815 

Thr Ala Leu Asn Ala Ser Glu Met Lys Lys Leu Ile Thr His Met Gly 

820 825 830 

Glu Met Asp His Pro Trp Asn Cys Pro His Gly Arg Pro Thr Met Arg 


His Val Ala Asn Leu Asp Val Ile Ser Gln Asn 
850 855 

<210> 2 

<211> 3056 

<212> DNA 

<213> Mus musculus 

<400> 2 

gaattccggt gaaggtcctg aagaatttcc agattcctga gtatcattgg aggagacaga 60 

taacctgtcg tcaggtaacg atggtgtata tgcaacagaa atgggtgttc ctggagacgc 120 

gtcttttccc gagagcggca ccgcaactct cccgcggtga ctgtgactgg aggagtcctg 180 

catccatgga gcaaaccgaa ggcgtgagta cagaatgtgc taaggccatc aagcctattg 240 

atgggaagtc agtccatcaa atttgttctg ggcaggtgat actcagttta agcaccgctg 300 

tgaaggagtt gatagaaaat agtgtagatg ctggtgctac tactattgat ctaaggctta 360 

aagactatgg ggtggacctc attgaagttt cagacaatgg atgtggggta gaagaagaaa 420 

actttgaagg tctagctctg aaacatcaca catctaagat tcaagagttt gccgacctca 480 

cgcaggttga aactttcggc tttcgggggg aagctctgag ctctctgtgt gcactaagtg 540 

atgtcactat atctacctgc cacgggtctg caagcgttgg gactcgactg gtgtttgacc 600 

ataatgggaa aatcacccag aaaactccct acccccgacc taaaggaacc acagtcagtg 660 

tgcagcactt attttataca ctacccgtgc gttacaaaga gtttcagagg aacattaaaa 720 

aggagtattc caaaatggtg caggtcttac aggcgtactg tatcatctca gcaggcgtcc 780 

gtgtaagctg cactaatcag ctcggacagg ggaagcggca cgctgtggtg tgcacaagcg 840

gcacgtctgg catgaaggaa aatatcgggt ctgtgtttgg ccagaagcag ttgcaaagcc 900 

tcattccttt tgttcagctg ccccctagtg acgctgtgtg tgaagagtac ggcctgagca 960 

cttcaggacg ccacaaaacc ttttctacgt ttcgggcttc atttcacagt gcacgcacgg 1020 

cgccgggagg agtgcaacag acaggcagtt tttcttcatc aatcagaggc cctgtgaccc 1080 
agcaaaggtc tctaagcttg tcaatgaggt tttatcacat gtataaccgg catcagtacc 1140
catttgtcgt ccttaacgtt tccgttgact cagaatgtgt ggatattaat gtaactccag 1200
ataaaaggca aattctacta caagaagaga agctattgct ggccgtttta aagacctcct 1260
tgataggaat gtttgacagt gatgcaaaca agcttaatgt caaccagcag ccactgctag 1320
atgttgaagg taacttagta aagctgcata ctgcagaact agaaaagcct gtgccaggaa 1380
agcaagataa ctctccttca ctgaagagca cagcagacga gaaaagggta gcatccatct 1440
ccaggctgag agaggccttt tctcttcatc ctactaaaga gatcaagtct aggggtccag 1500
agactgctga actgacacgg agttttccaa gtgagaaaag gggcgtgtta tcctcttatc 1560
cttcagacgt catctcttac agaggcctcc gtggctcgca ggacaaattg gtgagtccca 1620
cggacagccc tggtgactgt atggacagag agaaaataga aaaagactca gggctcagca 1680
gcacctcagc tggctctgag gaagagttca gcaccccaga agtggccagt agctttagca 1740
gtgactataa cgtgagctcc ctagaagaca gaccttctca ggaaaccata aactgtggtg 1800
acctggactg ccgtcctcca ggtacaggac agtccttgaa gccagaagac catggatatc 1860
aatgcaaagc tctacctcta gctcgtctgt cacccacaaa tgccaagcgc ttcaagacag 1920
aggaaagacc ctcaaatgtc aacatttctc aaagattgcc tggtcctcag agcacctcag 1980
cagctgaggt cgatgtagcc ataaaaatga ataagagaat cgtgctcctc gagttctctc 2040
tgagttctct agctaagcga atgaagcagt tacagcacct aaaggcgcag aacaaacatg 2100
aactgagtta cagaaaattt agggccaaga tttgccctgg agaaaaccaa gcagcagaag 2160
atgaactcag aaaagagatt agtaaatcga tgtttgcaga gatggagatc ttgggtcagt 2220
ttaacctggg atttatagta accaaactga aagaggacct cttcctggtg gaccagcatg 2280
ctgcggatga gaagtacaac tttgagatgc tgcagcagca cacggtgctc caggcgcaga 2340
ggctcatcac accccagact ctgaacttaa ctgctgtcaa tgaagctgta ctgatagaaa 2400
atctggaaat attcagaaag aatggctttg actttgtcat tgatgaggat gctccagtca 2460
ctgaaagggc taaattgatt tccttaccaa ctagtaaaaa ctggaccttt ggaccccaag 2520
atatagatga actgatcttt atgttaagtg acagccctgg ggtcatgtgc cggccctcac 2580
gagtcagaca gatgtttgct tccagagcct gtcggaagtc agtgatgatt ggaacggcgc 2640
tcaatgcgag cgagatgaag aagctcatca cccacatggg tgagatggac cacccctgga 2700
actgccccca cggcaggcca accatgaggc acgttgccaa tctggatgtc atctctcaga 2760
actgacacac cccttgtagc atagagttta ttacagattg ttcggtttgc aaagagaagg 2820
ttttaagtaa tctgattatc gttgtacaaa aattagcatg ctgctttaat gtactggatc 2880
catttaaaag cagtgttaag gcaggcatga tggagtgttc ctctagctca gctacttggg 2940
tgatccggtg ggagctcatg tgagcccagg actttgagac cactccgagc cacattcatg 3000
agactcaatt caaggacaaa aaaaaaaaga tatttttgaa gccttttaaa aaaaaa 3056


<210> 3 
<211> 932 
<212> PRT 
<213> Homo sapiens
<400> 3 

Met Lys Gln Leu Pro Ala Ala Thr Val Arg Leu Leu Ser Ser Ser Gln
1 5 10 15

Ile Ile Thr Ser Val Val Ser Val Val Lys Glu Leu Ile Glu Asn Ser
20 25 30 

Leu Asp Ala Gly Ala Thr Ser Val Asp Val Lys Leu Glu Asn Tyr Gly 
35 40 45 

Phe Asp Lys Ile Glu Val Arg Asp Asn Gly Glu Gly Ile Lys Ala Val
50 55 60 

Asp Ala Pro Val Met Ala Met Lys Tyr Tyr Thr Ser Lys Ile Asn Ser
65 70 75 80 

His Glu Asp Leu Glu Asn Leu Thr Thr Tyr Gly Phe Arg Gly Glu Ala 


Thr Ala Lys Lys Cys Lys Asp Glu Ile Lys Lys Ile Gln Asp Leu Leu
Leu Lys Leu Ile Arg His His Tyr Asn Leu Lys Cys Leu Lys Glu Ser

Thr Arg Leu Tyr Pro Val Phe Phe Leu Lys Ile Asp Val Pro Thr Ala
290 295 300 

Asp Val Asp Val Asn Leu Thr Pro Asp Lys Ser Gln Val Leu Leu Gln
305 310 315 320

Asn Lys Glu Ser Val Leu Ile Ala Leu Glu Asn Leu Met Thr Thr Cys
325 330 

Tyr Gly Pro Leu Pro Ser Thr Asn Ser Tyr Glu Asn Asn Lys Thr Asp 
340 345 

Val Ser Ala Ala Asp Ile Val Leu Ser Lys Thr Ala Glu Thr Asp Val
355 360 365 

Leu Phe Asn Lys Val Glu Ser Ser Gly Lys Asn Tyr Ser Asn Val Asp
370 375 380 

Thr Ser Val Ile Pro Phe Gln Asn Asp Met His Asn Asp Glu Ser Gly
385 390 395 


Gly Tyr Gly His Cys Ser Ser Glu Ile Ser Asn Ile Asp Lys Asn Thr
420 425 430

Lys Asn Ala Phe Gln Asp Ile Ser Met Ser Asn Val Ser Trp Glu Asn
435 

Ser Gln Thr Glu Tyr Ser Lys Thr Cys Phe Ile Ser Ser Val Lys His
450 455 460 

Thr Gln Ser Glu Asn Gly Asn Lys Asp His Ile Asp Glu Ser Gly Glu
465 470 

Asn Glu Glu Glu Ala Gly Leu Glu Asn Ser Ser Glu Ile Ser Ala Asp
485 490 

Glu Trp Ser Arg Gly Asn Ile Leu Lys Asn Ser Val Gly Glu Asn Ile
500 505 

Glu Pro Val Lys Ile Leu Val Pro Glu Lys Ser Leu Pro Cys Lys Val
515 520 525

Ser Asn Asn Asn Tyr Pro Ile Pro Glu Gln Met Asn Leu Asn Glu Asp
530

Ser Cys Asn Lys Lys Ser Asn Val Ile Asp Asn Lys Ser Gly Lys Val
545 550 555 560
Thr Ala Tyr Asp Leu Leu Ser Asn Arg Val Ile Lys Lys Pro Met Ser
565 570 575 

Ala Ser Ala Leu Phe Val Gln Asp His Arg Pro Gln Phe Leu Ile Glu
580 585 590 

Asn Pro Lys Thr Ser Leu Glu Asp Ala Thr Leu Gln Ile Glu Glu Leu
595 600 605 

Trp Lys Thr Leu Ser Glu Glu Glu Lys Leu Lys Tyr Glu Glu Lys Ala 
610 615 620 


Gln Glu Ser Gln Met Ser Leu Lys Asp Gly Arg Lys Lys Ile Lys Pro
645 650 655

Thr Ser Ala Trp Asn Leu Ala Gln Lys His Lys Leu Lys Thr Ser Leu
660 665 670 

Ser Asn Gln Pro Lys Leu Asp Glu Leu Leu Gln Ser Gln Ile Glu Lys
675 680 685


Arg Arg Ser Gln Asn Ile Lys Met Val Gln Ile Pro Phe Ser Met Lys
690 695 700

Asn Leu Lys Ile Asn Phe Lys Lys Gln Asn Lys Val Asp Leu Glu Glu
705 710 715 720

Lys Asp Glu Pro Cys Leu Ile His Asn Leu Arg Phe Pro Asp Ala Trp


Glu Glu Ala Leu Leu Phe Lys Arg Leu Leu Glu Asn His Lys Leu I 
755 760 765


Gly Ser His Tyr Leu Asp Val Leu Tyr Lys Met Thr Ala Asp Asp Gin 
790 795 800 


Gly Phe Lys Ile Lys Leu Ile Pro Gly Val Ser Ile Thr Glu Asn Tyr


Leu Glu Ile Glu Gly Met Ala Asn Cys Leu Pro Phe Tyr Gly Val Ala
Asp Leu Lys Glu Ile Leu Asn Ala Ile Leu Asn Arg Asn Ala Lys Glu
850 

Val Tyr Glu Cys Arg Pro Arg Lys Val Ile Ser Tyr Leu Glu Gly Glu


Ala Val Arg Leu Ser Arg Gln Leu Pro Met Tyr Leu Ser Lys Glu Asp
, He He Tyr Arg Met Lys His Gin Phe Gly Asn Glu He 


Lys Glu Cys Val His Gly Arg Pro Phe Phe His His Leu Thr Tyr Leu
915 920 


<210> 4 
<211> 2771 
<212> DNA 
<213> Homo sapiens 

cgaggcggat cgggtgttgc atccatggag cgagctgaga gctcgagtac agaacctgct 60
aaggccatca aacctattga tcggaagtca gtccatcaga tttgctctgg gcaggtggta 120
ctgagtctaa gcactgcggt aaaggagtta gtagaaaaca gtctggatgc tggtgccact 180
aatattgatc taaagcttaa ggactatgga gtggatctta ttgaagtttc agacaatgga 240
tgtggggtag aagaagaaaa cttcgaaggc ttaactctga aacatcacac atctaagatt 300
caagagtttg ccgacctaac tcaggttgaa acttttggct ttcgggggga agctctgagc 360
tcactttgtg cactgagcga tgtcaccatt tctacctgcc acgcatcggc gaaggttgga 420
actcgactga tgtttgatca caatgggaaa attatccaga aaacccccta cccccgcccc 480
agagggacca cagtcagcgt gcagcagtta ttttccacac tacctgtgcg ccataaggaa 540
tttcaaagga atattaagaa ggagtatgcc aaaatggtcc aggtcttaca tgcatactgt 600
atcatttcag caggcatccg tgtaagttgc accaatcagc ttggacaagg aaaacgacag 660
cctgtggtat gcacaggtgg aagccccagc ataaaggaaa atatcggctc tgtgtttggg 720
cagaagcagt tgcaaagcct cattcctttt gttcagctgc cccctagtga ctccgtgtgt 780
gaagagtacg gtttgagctg ttcggatgct ctgcataatc ttttttacat ctcaggtttc 840
atttcacaat gcacgcatgg agttggaagg agttcaacag acagacagtt tttctttatc 900
aaccggcggc cttgtgaccc agcaaaggtc tgcagactcg tgaatgaggt ctaccacatg 960
tataatcgac accagtatcc atttgttgtt cttaacattt ctgttgattc agaatgcgtt 1020
gatatcaatg ttactccaga taaaaggcaa attttgctac aagaggaaaa gcttttgttg 1080
gcagttttaa agacctcttt gataggaatg tttgatagtg atgtcaacaa gctaaatgtc 1140
agtcagcagc cactgctgga tgttgaaggt aacttaataa aaatgcatgc agcggatttg 1200

gaaaagccca tggtagaaaa gcaggatcaa tccccttcat taaggactgg agaagaaaaa 1260

aaagacgtgt ccatttccag actgcgagag gccttttctc ttcgtcacac aacagagaac 1320 

aagcctcaca gcccaaagac tccagaacca agaaggagcc ctctaggaca gaaaaggggt 1380 

atgctgtctt ctagcacttc aggtgccatc tctgacaaag gcgtcctgag acctcagaaa 1440 

gaggcagtga gttccagtca cggacccagt gaccctacgg acagagcgga ggtggagaag 1500 

gactcggggc acggcagcac ttccgtggat tctgaggggt tcagcatccc agacacgggc 1560 

agtcactgca gcagcgagta tgcggccagc tccccagggg acaggggctc gcaggaacat 1620 

gtggactctc aggagaaagc gcctgaaact gacgactctt tttcagatgt ggactgccat 1680 

tcaaaccagg aagataccgg atgtaaattt cgagttttgc ctcagccaac taatctcgca 1740 

accccaaaca caaagcgttt taaaaaagaa gaaattcttt ccagttctga catttgtcaa 1800 

aagttagtaa atactcagga catgtcagcc tctcaggttg atgtagctgt gaaaattaat 1860 

aagaaagttg tgcccctgga cttttctatg agttctttag ctaaacgaat aaagcagtta 1920 

catcatgaag cacagcaaag tgaaggggaa cagaattaca ggaagtttag ggcaaagatt 1980

tgtcctggag aaaatcaagc agccgaagat gaactaagaa aagagataag taaaacgatg 2040 

tttgcagaaa tggaaatcat tggtcagttt aacctgggat ttataataac caaactgaat 2100 

gaggatatct tcatagtgga ccagcatgcc acggacgaga agtataactt cgagatgctg 2160

cagcagcaca ccgtgctcca ggggcagagg ctcatagcac ctcagactct caacttaact 2220 

gctgttaatg aagctgttct gatagaaaat ctggaaatat ttagaaagaa tggctttgat 2280 

tttgttatcg atgaaaatgc tccagtcact gaaagggcta aactgatttc cttgccaact 2340 

agtaaaaact ggaccttcgg accccaggac gtcgatgaac tgatcttcat gctgagcgac 2400 

agccctgggg tcatgtgccg gccttcccga gtcaagcaga tgtttgcctc cagagcctgc 2460

cggaagtcgg tgatgattgg gactgctctt aacacaagcg agatgaagaa actgatcacc 2520 

cacatggggg agatggacca cccctggaac tgtccccatg gaaggccaac catgagacac 2580 

atcgccaacc tgggtgtcat ttctcagaac tgaccgtagt cactgtatgg aataattggt 2640 

tttatcgcag atttttatgt tttgaaagac agagtcttca ctaacctttt ttgttttaaa 2700 

atgaaacctg ctacttaaaa aaaatacaca tcacacccat ttaaaagtga tcttgagaac 2760 

cttttcaaac c 2771 

<210> 5 

<211> 932 

<212> PRT 

<213> Homo sapiens 

<400> 5 

Met Lys Gln Leu Pro Ala Ala Thr Val Arg Leu Leu Ser Ser Ser Gln


Ile Ile Thr Ser Val Val Ser Val Val Lys Glu Leu Ile Glu Asn Ser
Leu Asp Ala Gly Ala Thr Ser Val Asp Val Lys Leu Glu Asn Tyr Gly
35 40 45 

Phe Asp Lys Ile Glu Val Arg Asp Asn Gly Glu Gly Ile Lys Ala Val
50 55 60 

Asp Ala Pro Val Met Ala Met Lys Tyr Tyr Thr Ser Lys Ile Asn Ser
65 70 75 80

His Glu Asp Leu Glu Asn Leu Thr Thr Tyr Gly Phe Arg Gly Glu Ala
85 90 95 

Leu Gly Ser Ile Cys Cys Ile Ala Glu Val Leu Ile Thr Thr Arg Thr
100 105 110

Ala Ala Asp Asn Phe Ser Thr Gln Tyr Val Leu Asp Gly Ser Gly His
115 120 125


Ala Leu Arg Leu Phe Lys Asn Leu Pro Val Arg Lys Gln Phe Tyr Ser
145 150 155 160 

Thr Ala Lys Lys Cys Lys Asp Glu Ile Lys Lys Ile Gln Asp Leu Leu
165 170 175

Met Ser Phe Gly Ile Leu Lys Pro Asp Leu Arg Ile Val Phe Val His
180 185 190 

Asn Lys Ala Val Ile Trp Gln Lys Ser Arg Val Ser Asp His Lys Met


Ala Leu Met Ser Val Leu Gly Thr Ala Val Met Asn Asn Met Glu Ser 
210 215 220 

Phe Gln Tyr His Ser Glu Glu Ser Gln Ile Tyr Leu Ser Gly Phe Leu
225 230 235 240

Pro Lys Cys Asp Ala Asp His Ser Phe Thr Ser Leu Ser Thr Pro Glu 
245 250 

Arg Ser Phe Ile Phe Ile Asn Ser Arg Pro Val His Gln Lys Asp Ile
260 265 270 

Leu Lys Leu Ile Arg His His Tyr Asn Leu Lys Cys Leu Lys Glu Ser


Thr Arg Leu Tyr Pro Val Phe Phe Leu Lys Ile Asp Val Pro Thr Ala
290 295 300 

Asp Val Asp Val Asn Leu Thr Pro Asp Lys Ser Gln Val Leu Leu Gln
305 310 315 320 


Gly Tyr Gly His Cys Ser Ser Glu Ile Ser Asn Ile Asp Lys Asn Thr


Ser Gln Thr Glu Tyr Ser Lys Thr Cys Phe Ile Ser Ser Val Lys His


Asn Glu Glu Glu Ala Gly Leu Glu Asn Ser Ser Glu Ile Ser Ala Asp


Ser Asn Asn Asn Tyr Pro Ile Pro Glu Gln Met Asn Leu Asn Glu Asp


Asn Pro Lys Thr Ser Leu Glu Asp Ala Thr Leu Gln Ile Glu Glu Leu
595 600 605

Trp Lys Thr Leu Ser Glu Glu Glu Lys Leu Lys Tyr Glu Glu Lys Ala
610 615 620 

Thr Lys Asp Leu Glu Arg Tyr Asn Ser Gln Met Lys Arg Ala Ile Glu
625 630 635 640 

Gln Glu Ser Gln Met Ser Leu Lys Asp Gly Arg Lys Lys Ile Lys Pro
645 650 655 

Thr Ser Ala Trp Asn Leu Ala Gln Lys His Lys Leu Lys Thr Ser Leu
660 665 670 

Ser Asn Gln Pro Lys Leu Asp Glu Leu Leu Gln Ser Gln Ile Glu Lys
675 680 685 

Arg Arg Ser Gln Asn Ile Lys Met Val Gln Ile Pro Phe Ser Met Lys
690 695 700 

Asn Leu Lys Ile Asn Phe Lys Lys Gln Asn Lys Val Asp Leu Glu Glu
705 710 715 720 

Lys Asp Glu Pro Cys Leu Ile His Asn Leu Arg Phe Pro Asp Ala Trp
725 730 735 

Leu Met Thr Ser Lys Thr Glu Val Met Leu Leu Asn Pro Tyr Arg Val 
740 745 750 

Glu Glu Ala Leu Leu Phe Lys Arg Leu Leu Glu Asn His Lys Leu Pro 
755 760 765 

Ala Glu Pro Leu Glu Lys Pro Ile Met Leu Thr Glu Ser Leu Phe Asn
770 775 780 

Gly Ser His Tyr Leu Asp Val Leu Tyr Lys Met Thr Ala Asp Asp Gln
785 790 795 800 

Arg Tyr Ser Gly Ser Thr Tyr Leu Ser Asp Pro Arg Leu Thr Ala Asn 
805 810 815 

Gly Phe Lys Ile Lys Leu Ile Pro Gly Val Ser Ile Thr Glu Asn Tyr
820 825 830

Leu Glu Ile Glu Gly Met Ala Asn Cys Leu Pro Phe Tyr Gly Val Ala
835 840 845 

Asp Leu Lys Glu Ile Leu Asn Ala Ile Leu Asn Arg Asn Ala Lys Glu
850 855 860 

Val Tyr Glu Cys Arg Pro Arg Lys Val Ile Ser Tyr Leu Glu Gly Glu
865 870 875 880 
Ala Val Arg Leu Ser Arg Gln Leu Pro Met Tyr Leu Ser Lys Glu Asp
885 890 895 

Ile Gln Asp Ile Ile Tyr Arg Met Lys His Gln Phe Gly Asn Glu Ile
900 905 910 

Lys Glu Cys Val His Gly Arg Pro Phe Phe His His Leu Thr Tyr Leu 
915 920 925 

Pro Glu Thr Thr 
930 

<210> 6

<211> 3063 

<212> DNA 

<213> Homo sapiens 

<400> 6 

ggcacgagtg gctgcttgcg gctagtggat ggtaattgcc tgcctcgcgc tagcagcaag 60

ctgctctgtt aaaagcgaaa atgaaacaat tgcctgcggc aacagttcga ctcctttcaa 120 

gttctcagat catcacttcg gtggtcagtg ttgtaaaaga gcttattgaa aactccttgg 180 

atgctggtgc cacaagcgta gatgttaaac tggagaacta tggatttgat aaaattgagg 240 

tgcgagataa cggggagggt atcaaggctg ttgatgcacc tgtaatggca atgaagtact 300 

acacctcaaa aataaatagt catgaagatc ttgaaaattt gacaacttac ggttttcgtg 360 

gagaagcctt ggggtcaatt tgttgtatag ctgaggtttt aattacaaca agaacggctg 420

ctgataattt tagcacccag tatgttttag atggcagtgg ccacatactt tctcagaaac 480

cttcacatct tggtcaaggt acaactgtaa ctgctttaag attatttaag aatctacctg 540 

taagaaagca gttttactca actgcaaaaa aatgtaaaga tgaaataaaa aagatccaag 600

atctcctcat gagctttggt atccttaaac ctgacttaag gattgtcttt gtacataaca 660

aggcagttat ttggcagaaa agcagagtat cagatcacaa gatggctctc atgtcagttc 720 

tggggactgc tgttatgaac aatatggaat cctttcagta ccactctgaa gaatctcaga 780 

tttatctcag tggatttctt ccaaagtgtg atgcagacca ctctttcact agtctttcaa 840 

caccagaaag aagtttcatc ttcataaaca gtcgaccagt acatcaaaaa gatatcttaa 900

agttaatccg acatcattac aatctgaaat gcctaaagga atctactcgt ttgtatcctg 960 

ttttctttct gaaaatcgat gttcctacag ctgatgttga tgtaaattta acaccagata 1020 

aaagccaagt attattacaa aataaggaat ctgttttaat tgctcttgaa aatctgatga 1080 

cgacttgtta tggaccatta cctagtacaa attcttatga aaataataaa acagatgttt 1140 

ccgcagctga catcgttctt agtaaaacag cagaaacaga tgtgcttttt aataaagtgg 1200 

aatcatctgg aaagaattat tcaaatgttg atacttcagt cattccattc caaaatgata 1260

tgcataatga tgaatctgga aaaaacactg atgattgttt aaatcaccag ataagtattg 1320 

gtgactttgg ttatggtcat tgtagtagtg aaatttctaa cattgataaa aacactaaga 1380 

atgcatttca ggacatttca atgagtaatg tatcatggga gaactctcag acggaatata 1440 

gtaaaacttg ttttataagt tccgttaagc acacccagtc agaaaatggc aataaagacc 1500 
atatagatga gagtggggaa aatgaggaag aagcaggtct tgaaaactct tcggaaattt 1560
ctgcagatga gtggagcagg ggaaatatac ttaaaaattc agtgggagag aatattgaac 1620
ctgtgaaaat tttagtgcct gaaaaaagtt taccatgtaa agtaagtaat aataattatc 1680
caatccctga acaaatgaat cttaatgaag attcatgtaa caaaaaatca aatgtaatag 1740
ataataaatc tggaaaagtt acagcttatg atttacttag caatcgagta atcaagaaac 1800
ccatgtcagc aagtgctctt tttgttcaag atcatcgtcc tcagtttctc atagaaaatc 1860
ctaagactag tttagaggat gcaacactac aaattgaaga actgtggaag acattgagtg 1920
aagaggaaaa actgaaatat gaagagaagg ctactaaaga cttggaacga tacaatagtc 1980
aaatgaagag agccattgaa caggagtcac aaatgtcact aaaagatggc agaaaaaaga 2040
taaaacccac cagcgcatgg aatttggccc agaagcacaa gttaaaaacc tcattatcta 2100
atcaaccaaa acttgatgaa ctccttcagt cccaaattga aaaaagaagg agtcaaaata 2160
ttaaaatggt acagatcccc ttttctatga aaaacttaaa aataaatttt aagaaacaaa 2220
acaaagttga cttagaagag aaggatgaac cttgcttgat ccacaatctc aggtttcctg 2280
atgcatggct aatgacatcc aaaacagagg taatgttatt aaatccatat agagtagaag 2340
aagccctgct atttaaaaga cttcttgaga atcataaact tcctgcagag ccactggaaa 2400
agccaattat gttaacagag agtcttttta atggatctca ttatttagac gttttatata 2460
aaatgacagc agatgaccaa agatacagtg gatcaactta cctgtctgat cctcgtctta 2520
cagcgaatgg tttcaagata aaattgatac caggagtttc aattactgaa aattacttgg 2580
aaatagaagg aatggctaat tgtctcccat tctatggagt agcagattta aaagaaattc 2640
ttaatgctat attaaacaga aatgcaaagg aagtttatga atgtagacct cgcaaagtga 2700
taagttattt agagggagaa gcagtgcgtc tatccagaca attacccatg tacttatcaa 2760
aagaggacat ccaagacatt atctacagaa tgaagcacca gtttggaaat gaaattaaag 2820
agtgtgttca tggtcgccca ttttttcatc atttaaccta tcttccagaa actacatgat 2880
taaatatgtt taagaagatt agttaccatt gaaattggtt ctgtcataaa acagcatgag 2940
tctggtttta aattatcttt gtattatgtg tcacatggtt attttttaaa tgaggattca 3000
ctgacttgtt tttatattga aaaaagttcc acgtattgta gaaaacgtaa ataaactaat 3060
aac 3063


<210> 7 

<211> 934 

<212> PRT 

<213> Homo sapiens 

<400> 7 

Met Ala Val Gln Pro Lys Glu Thr Leu Gln Leu Glu Ser Ala Ala Glu


Val Gly Phe Val Arg Phe Phe Gln Gly Met Pro Glu Lys Pro Thr Thr
Thr Val Arg Leu Phe Asp Arg Gly Asp Phe Tyr Thr Ala His Gly Glu


Asp Ala Leu Leu Ala Ala Arg Glu Val Phe Lys Thr Gln Gly Val Ile


Lys Tyr Met Gly Pro Ala Gly Ala Lys Asn Leu Gln Ser Val Val Leu


Ser Lys Met Asn Phe Glu Ser Phe Val Lys Asp Leu Leu Leu Val Arg
Thr Gly Ser Gln Ser Leu Ala Ala Leu Leu Asn Lys Cys Lys Thr Pro
325 330 

Gln Gly Gln Arg Leu Val Asn Gln Trp Ile Lys Gln Pro Leu Met Asp
340 345 

Lys Asn Arg Ile Glu Glu Arg Leu Asn Leu Val Glu Ala Phe Val Glu
355 360 365 

Asp Ala Glu Leu Arg Gln Thr Leu Gln Glu Asp Leu Leu Arg Arg Phe
370 375 380

I Leu Ala Lys Lys Phe Gln Arg Gln Ala Ala Asn
390 395 400 

Leu Gln Asp Cys Tyr Arg Leu Tyr Gln Gly Ile Asn Gln Leu Pro Asn
405 410 415

Val Ile Gln Ala Leu Glu Lys His Glu Gly Lys His Gln Lys Leu Leu
420 425 430 

Leu Ala Val Phe Val Thr Pro Leu Thr Asp Leu Arg Ser Asp Phe Ser 
435 440 445 

I Phe Gln Glu Met Ile Glu Thr Thr Leu Asp Met Asp Gln Val Glu


Asn His Glu Phe Leu Val Lys Pro Ser Phe Asp Pro Asn Leu Ser Glu 
465 470 475 480 

Leu Arg Glu Ile Met Asn Asp Leu Glu Lys Lys Met Gln Ser Thr Leu
485 490 495


Leu Asp Ser Ser Ala Gln Phe Gly Tyr Tyr Phe Arg Val Thr Cys Lys
515 520 

Glu Glu Lys Val Leu Arg Asn Asn Lys Asn Phe Ser Thr Val Asp Ile


Gln Lys Asn Gly Val Lys Phe Thr Asn Ser Lys Leu Thr Ser Leu Asn
545 550 555 560 

Glu Glu Tyr Thr Lys Asn Lys Thr Glu Tyr Glu Glu Ala Gln Asp Ala
565 570 575

Ile Val Lys Glu Ile Val Asn Ile Ser Ser Gly Tyr Val Glu Pro Met


Gln Thr Leu Asn Asp Val Leu Ala Gln Leu Asp Ala Val Val Ser Phe
595 600 605
His Val Thr Ala Leu Thr Thr Glu Glu Thr Leu Thr Met Leu Tyr Gln


Leu Ala Asn Phe Pro Lys His Val Ile Glu Cys Ala Lys Gln Lys Ala


Glu Lys Ile Ile Gln Glu Phe Leu Ser Lys Val Lys Gln Met Pro Phe
885 890 895


Thr Glu Met Ser Glu Glu Asn Ile Thr Ile Lys Leu Lys Gln Leu Lys
900 905 910

Ala Glu Val Ile Ala Lys Asn Asn Ser Phe Val Asn Glu Ile Ile Ser
915 920 


<210> 8 

<211> 3145 

<212> DNA 

<213> Homo sapiens 

ggjgjgaaac agcttagtgg gtgtggggtc gcgcattttc ttcaaccagg aggtgaggag 60
gtttcgacat ggcggtgcag ccgaaggaga cgctgcagtt ggagagcgcg gccgaggtcg 120
gcttcgtgcg cttctttcag ggcatgccgg agaagccgac caccacagtg cgccttttcg 180
accggggcga cttctatacg gcgcacggcg aggacgcgct gctggccgcc cgggaggtgt 240
tcaagaccca gggggtgatc aagtacatgg ggccggcagg agcaaagaat ctgcagagtg 300
ttgtgcttag taaaatgaat tttgaatctt ttgtaaaaga tcttcttctg gttcgtcagt 360
atagagttga agtttataag aatagagctg gaaataaggc atccaaggag aatgattggt 420
atttggcata taaggcttct cctggcaatc tctctcagtt tgaagacatt ctctttggta 480
acaatgatat gtcagcttcc attggtgttg tgggtgttaa aatgtccgca gttgatggcc 540
agagacaggt tggagttggg tatgtggatt ccatacagag gaaactagga ctgtgtgaat 600
tccctgataa tgatcagttc tccaatcttg aggctctcct catccagatt ggaccaaagg 660
aatgtgtttt acccggagga gagactgctg gagacatggg gaaactgaga cagataattc 720
aaagaggagg aattctgatc acagaaagaa aaaaagctga cttttccaca aaagacattt 780
atcaggacct caaccggttg ttgaaaggca aaaagggaga gcagatgaat agtgctgtat 840
tgccagaaat ggagaatcag gttgcagttt catcactgtc tgcggtaatc aagtttttag 900
aactcttatc agatgattcc aactttggac agtttgaact gactactttt gacttcagcc 960
agtatatgaa attggatatt gcagcagtca gagcccttaa cctttttcag ggttctgttg 1020
aagataccac tggctctcag tctctggctg ccttgctgaa taagtgtaaa acccctcaag 1080
gacaaagact tgttaaccag tggattaagc agcctctcat ggataagaac agaatagagg 1140
agagattgaa tttagtggaa gcttttgtag aagatgcaga attgaggcag actttacaag 1200
aagatttact tcgtcgattc ccagatctta accgacttgc caagaagttt caaagacaag 1260
cagcaaactt acaagattgt taccgactct atcagggtat aaatcaacta cctaatgtta 1320
tacaggctct ggaaaaacat gaaggaaaac accagaaatt attgttggca gtttttgtga 1380
ctcctcttac tgatcttcgt tctgacttct ccaagtttca ggaaatgata gaaacaactt 1440
tagatatgga tcaggtggaa aaccatgaat tccttgtaaa accttcattt gatcctaatc 1500
tcagtgaatt aagagaaata atgaatgact tggaaaagaa gatgcagtca acattaataa 1560
gtgcagccag agatcttggc ttggaccctg gcaaacagat taaactggat tccagtgcac 1620
agtttggata ttactttcgt gtaacctgta aggaagaaaa agtccttcgt aacaataaaa 1680
actttagtac tgtagatatc cagaagaatg gtgttaaatt taccaacagc aaattgactt 1740
ctttaaatga agagtatacc aaaaataaaa cagaatatga agaagcccag gatgccattg 1800
ttaaagaaat tgtcaatatt tcttcaggct atgtagaacc aatgcagaca ctcaatgatg 1860
tgttagctca gctagatgct gttgtcagct ttgctcacgt gtcaaatgga gcacctgttc 1920
catatgtacg accagccatt ttggagaaag gacaaggaag aattatatta aaagcatcca 1980
ggcatgcttg tgttgaagtt caagatgaaa ttgcatttat tcctaatgac gtatactttg 2040
aaaaagataa acagatgttc cacatcatta ctggccccaa tatgggaggt aaatcaacat 2100
atattcgaca aactggggtg atagtactca tggcccaaat tgggtgtttt gtgccatgtg 2160
agtcagcaga agtgtccatt gtggactgca tcttagcccg agtaggggct ggtgacagtc 2220
aattgaaagg agtctccacg ttcatggctg aaatgttgga aactgcttct atcctcaggt 2280
ctgcaaccaa agattcatta ataatcatag atgaattggg aagaggaact tctacctacg 2340
atggatttgg gttagcatgg gctatatcag aatacattgc aacaaagatt ggtgcttttt 2400
gcatgtttgc aacccatttt catgaactta ctgccttggc caatcagata ccaactgtta 2460
ataatctaca tgtcacagca ctcaccactg aagagacctt aactatgctt tatcaggtga 2520
agaaaggtgt ctgtgatcaa agttttggga ttcatgttgc agagcttgct aatttcccta 2580
agcatgtaat agagtgtgct aaacagaaag ccctggaact tgaggagttt cagtatattg 2640
gagaatcgca aggatatgat atcatggaac cagcagcaaa gaagtgctat ctggaaagag 2700
agcaaggtga aaaaattatt caggagttcc tgtccaaggt gaaacaaatg ccctttactg 2760
aaatgtcaga agaaaacatc acaataaagt taaaacagct aaaagctgaa gtaatagcaa 2820
agaataatag ctttgtaaat gaaatcattt cacgaataaa :acg tgaaaaatcc 2880
cagtaatgga atgaaggtaa tattgataag ctattgtctg taatagtttt atattgtttt 2940
atattaaccc tttttccata gtgttaactg tcagtgccca tgggctatca acttaataag 3000
[illegible] atattttact ttgaggacat tttcaaagat ttttattttg aaaaatgaga 3060
gctgtaactg aggactgttt gcaattgaca taggcaataa taagtgatgt gctgaatttt 3120
ataaataaaa tcatgtagtt tgtgg 3145


<210> 9 

<211> 756 

<212> PRT 

<213> Homo sapiens 


Met Ser Phe Val Ala Gly Val Ile Arg Arg Leu Asp Glu Thr Val Val


Asn Arg Ile Ala Ala Gly Glu Val Ile Gln Arg Pro Ala Asn Ala Ile
Lys Glu Met Ile Glu Asn Cys Leu Asp Ala Lys Ser Thr Ser Ile Gln
35 40 45

Val Ile Val Lys Glu Gly Gly Leu Lys Leu Ile Gln Ile Gln Asp Asn
50 

Gly Thr Gly Ile Arg Lys Glu Asp Leu Asp Ile Val Cys Glu Arg Phe

Thr Thr Ser Lys Leu Gln Ser Phe Glu Asp Leu Ala Ser Ile Ser Thr

Tyr Gly Phe Arg Gly Glu Ala Leu Ala Ser Ile Ser His Val Ala His
100 

Val Thr Ile Thr Thr Lys Thr Ala Asp Gly Lys Cys Ala Tyr Arg Ala
Ser Tyr Ser Asp Gly Lys Leu Lys Ala Pro Pro Lys Pro Cys Ala Gly


Asn Gln Gly Thr Gln Ile


145 


150 


Thr Val Glu Asp Leu Phe Tyr Asn He 2 


Arg Arg Lys Ala Leu Lys Asn Pro Ser Glu Glu Tyr Gly Lys Ile


Thr.._. 


Leu Glu Val Val Gly Arg Tyr Ser 
180 


Val His Asn Ala Gly Ile Ser Phe


Ser Val Lys Lys Gln Gly Glu Thr


Val Ala Asp Val Arg Thr Leu Pro 


Lys Met Asn 

245 


Asn Ala Ser Thr Val Asp Asn Ile Arg Ser Ile Phe Gly Asn Ala Val
210 215 

Ser Arg Glu Leu Ile Glu Ile Gly Cys Glu Asp Lys Thr Leu Ala Phe

Gly Tyr Ile Ser Asn Ala Asn Tyr Ser Val Lys Lys Cys

Ile Phe Leu Leu Phe Ile Asn His Arg Leu Val Glu Ser Thr Ser Leu
260 265 

Arg Lys Ala Ile Glu Thr Val Tyr Ala Ala Tyr Leu Pro Lys Asn Thr


His Pro Phe Leu Tyr Leu Ser Leu Glu Ile Ser Pro Gln Asn Val Asp


Val Asn Val His Pro Thr Lys


His Glu Val His Phe Leu His Glu Glu
Ser Ile Leu Glu Arg Val Gln Gln His Ile Glu Ser Lys Leu Leu Gly


Arg Thr Asp Ser Arg Glu Gln Lys Leu Asp Ala Phe Leu Gln Pro Leu


Ser Lys Pro Leu Ser Ser Gln Pro Gln Ala Ile Val Thr Glu Asp Lys


Glu Leu Pro Ala Pro Ala Glu Val Ala Ala Lys Asn Gln Ser Leu Glu


: Asn Pro Arg Lys Arg His Arg Glu Asp Ser Asp Val Glu 


Tyr Gln Ile Leu Ile Tyr Asp Phe Ala Asn Phe Gly Val Leu Arg Leu


Pro Glu Ser Gly Trp Thr Glu Glu Asp Gly Pro Lys Glu Gly Leu Ala 
595


Glu Tyr Ile Val Glu Phe Leu Lys Lys Lys Ala Glu Met Leu Ala Asp
610 615 620

Tyr Phe Ser Leu Glu Ile Asp Glu Glu Gly Asn Leu Ile Gly Leu Pro
625 630 635 640


Ile Leu Arg Leu Ala Thr Glu Val Asn Trp Asp Glu Glu Lys Glu Cys
660 665 670 


Phe Glu Ser Leu 


Ser Lys Glu Cys Ala Met Phe Tyr Ser Ile Arg Lys


Gln Tyr Ile Ser Glu Glu Ser Thr Leu Ser Gly Gln Gln Ser Glu Val
690 695 700 

Pro Gly Ser Ile Pro Asn Ser Trp Lys Trp Thr Val Glu His Ile Val
705 710 715 720 

Tyr Lys Ala Leu Arg Ser His Ile Leu Pro Pro Lys His Phe Thr Glu
725 730 735 

Asp Gly Asn Ile Leu Gln Leu Ala Asn Leu Pro Asp Leu Tyr Lys Val
740 745 750 


<210> 10 

<211> 2484 

<212> DNA 

<213> Homo sapiens 

<400> 10 

cttggctctt ctggcgccaa aatgtcgttc gtggcagggg ttattcggcg gctggacgag 
acagtggtga accgcatcgc ggcgggggaa gttatccagc ggccagctaa tgctatcaaa 
gagatgattg agaactgttt agatgcaaaa tccacaagta ttcaagtgat tgttaaagag 
ggaggcctga agttgattca gatccaagac aatggcaccg ggatcaggaa agaagatctg 
gatattgtat gtgaaaggtt cactactagt aaactgcagt cctttgagga tttagccagt 
atttctacct atggctttcg aggtgaggct ttggccagca taagccatgt ggctcatgtt 
actattacaa cgaaaacagc tgatggaaag tgtgcataca gagcaagtta ctcagatgga 
aaactgaaag cccctcctaa accatgtgct ggcaatcaag ggacccagat cacggtggag 
gacctttttt acaacatagc cacgaggaga aaagctttaa aaaatccaag tgaagaatat 
gggaaaattt tggaagttgt tggcaggtat tcagtacaca atgcaggcat tagtttctca 
gttaaaaaac aaggagagac agtagctgat gttaggacac tacccaatgc ctcaaccgtg 
gacaatattc gctccatctt tggaaatgct gttagtcgag aactgataga aattggatgt 720

gaggataaaa ccctagcctt caaaatgaat ggttacatat ccaatgcaaa ctactcagtg 780 

aagaagtgca tcttcttact cttcatcaac catcgtctgg tagaatcaac ttccttgaga 840 

aaagccatag aaacagtgta tgcagcctat ttgcccaaaa acacacaccc attcctgtac 900 

ctcagtttag aaatcagtcc ccagaatgtg gatgttaatg tgcaccccac aaagcatgaa 960 

gttcacttcc tgcacgagga gagcatcctg gagcgggtgc agcagcacat cgagagcaag 1020 

ctcctgggct ccaattcctc caggatgtac ttcacccaga ctttgctacc aggacttgct 1080 

ggcccctctg gggagatggt taaatccaca acaagtctga cctcgtcttc tacttctgga 1140

agtagtgata aggtctatgc ccaccagatg gttcgtacag attcccggga acagaagctt 1200 

gatgcatttc tgcagcctct gagcaaaccc ctgtccagtc agccccaggc cattgtcaca 1260 

gaggataaga cagatatttc tagtggcagg gctaggcagc aagatgagga gatgcttgaa 1320 

ctcccagccc ctgctgaagt ggctgccaaa aatcagagct tggaggggga tacaacaaag 1380 

gggacttcag aaatgtcaga gaagagagga cctacttcca gcaaccccag aaagagacat 1440 

cgggaagatt ctgatgtgga aatggtggaa gatgattccc gaaaggaaat gactgcagct 1500 

tgtacccccc ggagaaggat cattaacctc actagtgttt tgagtctcca ggaagaaatt 1560

aatgagcagg gacatgaggt tctccgggag atgttgcata accactcctt cgtgggctgt 1620 

gtgaatcctc agtgggcctt ggcacagcat caaaccaagt tataccttct caacaccacc 1680

aagcttagtg aagaactgtt ctaccagata ctcatttatg attttgccaa ttttggtgtt 1740 

ctcaggttat cggagccagc accgctcttt gaccttgcca tgcttgcctt agatagtcca 1800 

gagagtggct ggacagagga agatggtccc aaagaaggac ttgctgaata cattgttgag 1860

tttctgaaga agaaggctga gatgcttgca gactatttct ctttggaaat tgatgaggaa 1920 

gggaacctga ttggattacc ccttctgatt gacaactatg tgcccccttt ggagggactg 1980 

cctatcttca ttcttcgact agccactgag gtgaattggg acgaagaaaa ggaatgtttt 2040 

gaaagcctca gtaaagaatg cgctatgttc tattccatcc ggaagcagta catatctgag 2100 

gagtcgaccc tctcaggcca gcagagtgaa gtgcctggct ccattccaaa ctcctggaag 2160 

tggactgtgg aacacattgt ctataaagcc ttgcgctcac acattctgcc tcctaaacat 2220 

ttcacagaag atggaaatat cctgcagctt gctaacctgc ctgatctata caaagtcttt 2280 

gagaggtgtt aaatatggtt atttatgcac tgtgggatgt gttcttcttt ctctgtattc 2340 

cgatacaaag tgttgtatca aagtgtgata tacaaagtgt accaacataa gtgttggtag 2400 

cacttaagac ttatacttgc cttctgatag tattccttta tacacagtgg attgattata 2460

aataaataga tgtgtcttaa cata 2484 

<210> 11 
<211> 133 
<212> PRT 
<213> Homo sapiens 

<400> 11 

Met Lys Gln Leu Pro Ala Ala Thr Val Arg Leu Leu Ser Ser Ser Gln
1 5 10 15

Ile Ile Thr Ser Val Val Ser Val Val Lys Glu Leu Ile Glu Asn Ser
20 25 

Leu Asp Ala Gly Ala Thr Ser Val Asp Val Lys Leu Glu Asn Tyr Gly
35 40 45 

Phe Asp Lys Ile Glu Val Arg Asp Asn Gly Glu Gly Ile Lys Ala Val

Asp Ala Pro Val Met Ala Met Lys Tyr Tyr Thr Ser Lys Ile Asn Ser

His Glu Asp Leu Glu Asn Leu Thr Thr Tyr Gly Phe Arg Gly Glu Ala

Leu Gly Ser Ile Cys Cys Ile Ala Glu Val Leu Ile Thr Thr Arg Thr
100 

Ala Ala Asp Asn Phe Ser Thr Gln Tyr Val Leu Asp Gly Ser Gly His
115 120 125

Ile Leu Ser Gln Lys


<210> 12 

<211> 426 

<212> DNA 

<213> Homo sapiens 


cgaggcggat cgggtgttgc atccatggag cgagctgaga gctcgagtac agaacctgct 
aaggccatca aacctattga tcggaagtca gtccatcaga tttgctctgg gcaggtggta 
ctgagtctaa gcactgcggt aaaggagtta gtagaaaaca gtctggatgc tggtgccact 
aatattgatc taaagcttaa ggactatgga gtggatctta ttgaagtttc agacaatgga
tgtggggtag aagaagaaaa cttcgaaggc ttaactctga aacatcacac atctaagatt 
caagagtttg ccgacctaac tcaggttgaa acttttggct ttcgggggga agctctgagc
tcactttgtg cactgagcga tgtcaccatt tctacctgcc acgcatcggc gaaggttgga 
acttga 

<210> 13 
<211> 100 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Oligonucleotide Primer 

ttcagagtag aaaactaaat atgatgaata actaaaaata atttctcaaa tttttttctg 
atggttcctt ctgcttcatc cccgtggccc gttgctcgcg 
<210> 14
<211> 100 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Oligonucleotide Primer 
<400> 14 

gccccagttg gaccctgagg tcgtactcac cccaacagct cagcgccccc tctccagcgc 60 
cgccataagc tacccagctt ctagagatct gacggttcac 100 


<210> 15 
<211> 100 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer
<400> 15 

tgtgtgtgtg ttgtggtcag tggggctgga ataaaagtag aatagacctg cacctgctgt 60
ggcatccatt ctgcttcatc cccgtggccc gttgctcgcg 100 


<210> 16 
<211> 100 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 16 

tcaggagtca ggtgcaccat ggtgtctgtt tgaggttgct agtgaacaca gttgtgtcag 60 
aagcaaatgt tacccagctt ctagagatct gacggttcac 100 


<210> 17
<211> 100 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 17 

gttctctgga cgtaattttt cttgagcaga gcaacagtag agctttgtat gcaacaatgt 60 
aatttttaca ctgcttcatc cccgtggccc gttgctcgcg 100 


<210> 18 
<211> 100 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 18 

atcaggtcca aaggacttaa ctgatctttc tcttctaata gctgatcttc agatgatcag 60 
aacaatgtgc tacccagctt ctagagatct gacggttcac 100

<210> 19
<211> 100 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 19 

tggcaggagc ggaagcaaga gagggaaggg aggaggtgcc acacactttc aaacaaccag 
atcttcagac ctgcttcatc cccgtggccc gttgctcgcg 

<210> 20 
<211> 100 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 

gctcggctga aatccgcgcc ccttagaagt cacggtgcgc gagcagagac tggacggatt 
ctagcgggat tacccagctt ctagagatct gacggttcac 


<210> 22 

<211> 25 

<212> DNA 

<213> Artificial Sequence 


<210> 23 

<211> 23 

<212> DNA 

<213> Artificial Sequence 


<400> 23 

cattcggtac tggcgtattt ctc


<210> 24 

<211> 24 

<212> DNA 

<213> Artificial Sequence 


<223> Oligonucleotide Primer

<210> 25

<211> 24

<212> DNA

<213> Artificial Sequence


<210> 26 

<211> 24 

<212> DNA 

<213> Artificial Sequence


<210> 27 

<211> 24 

<212> DNA 

<213> Artificial Sequence 


<210> 28 

<211> 26 

<212> DNA 

<213> Artificial Sequence 


<210> 29 

<211> 23

<212> DNA 

<213> Artificial Sequence 


<223> Oligonucleotide Primer


<210> 30 

<211> 30 

<212> DNA 

<213> Artificial Sequence

<223> Oligonucleotide Primer

<400> 30

atgaaaaagc ctgaactcac cgcgacgtct 30

<210> 31 

<211> 30 

<212> DNA 

<213> Artificial Sequence 

<220>

<223> Oligonucleotide Primer

<400> 31

tttatataat tcatccatac catgtgtgtg 30

<210> 32 

<211> 30 

<212> DNA 

<213> Artificial Sequence

<220>

<223> Oligonucleotide Primer

<400> 32

atgaaaaagc ctgaactcac cgcgacgtct 30


<212> DNA 

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primer
<400> 33 

caatttggac tttccgccct tcttggcctt 

<210> 34 
<211> 550 
<212> PRT 

<213> Photinus pyralis 
<400> 34 

Met Glu Asp Ala Lys Asn Ile Lys Lys Gly Pro Ala Pro Phe Tyr Pro

Leu Glu Asp Gly Thr Ala Gly Glu Gln Leu His Lys Ala Met Lys Arg

20 25

Tyr Ala Leu Val Pro Gly Thr Ile Ala Phe Thr Asp Ala His Ile Glu

Val Asn Ile Thr Tyr Ala Glu Tyr Phe Glu Met Ser Val Arg Leu Ala
50

Glu Ala Met Lys Arg Tyr Gly Leu Asn Thr Asn His Arg Ile Val Val
65 70

Cys Ser Glu Asn Ser Leu Gln Phe Phe Met Pro Val Leu Gly Ala Leu
Glu Leu Leu Asn Ser Met Asn Ile Ser Gln Pro Thr Val Val Phe Val
115 120 125


Ile Ile Gln Lys Ile Ile Ile Met Asp Ser Lys Thr Asp Tyr Gln Gly
145 150 155 160 


Val Pro Phe His His Gly Phe Gly Met Phe Thr Thr Leu Gly Tyr Leu 
245 250 255 

Ile Cys Gly Phe Arg Val Val Leu Met Tyr Arg Phe Glu Glu Glu Leu


Phe Glu Ala Lys Val Val Asp Leu Asp Thr Gly Lys Thr Leu Gly Val
370 375 380

Asn Gln Arg Gly Glu Leu Cys Val Arg Gly Pro Met Ile Met Ser Gly
385 390

Tyr Val Asn Asn Pro Glu Ala Thr Asn Ala Leu Ile Asp Lys Asp Gly
405 410

Trp Leu His Ser Gly Asp Ile Ala Tyr Trp Asp Glu Asp Glu His Phe
420


Val Ala Pro Ala Glu Leu Glu Ser Ile Leu Leu Gln His Pro Asn Ile
450

Phe Asp Ala Gly Val Ala Gly Leu Pro Asp Asp Asp Ala Gly Glu Leu

Pro Ala Ala Val Val Val Leu Glu His Gly Lys Thr Met Thr Glu Lys

Glu Ile Val Asp Tyr Val Ala Ser Gln Val Thr Thr Ala Lys Lys Leu
500 505 510

Arg Gly Gly Val Val Phe Val Asp Glu Val Pro Lys Gly Leu Thr Gly
Lys Leu Asp Ala Arg Lys Ile Arg Glu Ile Leu Ile Lys Ala Lys Lys

Gly Gly Lys Ser Lys Leu 
545 550 

<210> 35 
<211> 2387 
<212> DNA 

<213> Photinus pyralis 

ctgcagaaat aactaggtac taagcccgtt tgtgaaaagt ggccaaaccc ataaatttgg
caattacaat aaagaagcta aaattgtggt caaactcaca aacattttta ttatatacat
tttagtagct gatgcttata aaagcaatat ttaaatcgta aacaacaaat aaaataaaat
ttaaacgatg tgattaagag ccaaaggtcc tctagaaaaa ggtatttaag caacggaatt
cctttgtgtt acattcttga atgtcgctcg cagtgacatt agcattccgg tactgttggt
aaaatggaag acgccaaaaa cataaagaaa ggcccggcgc cattctatcc tctagaggat
ggaaccgctg gagagcaact gcataaggct atgaagagat acgccctggt tcctggaaca
attgcttttg tgagtatttc tgtctgattt ctttcgagtt aacgaaatgt tcttatgttt
ctttagacag atgcacatat cgaggtgaac atcacgtacg cggaatactt cgaaatgtcc
gttcggttgg cagaagctat gaaacgatat gggctgaata caaatcacag aatcgtcgta 600

tgcagtgaaa actctcttca attctttatg ccggtgttgg gcgcgttatt tatcggagtt 660 

gcagttgcgc ccgcgaacga catttataat gaacgtaagc accctcgcca tcagaccaaa 720 

gggaatgacg tatttaattt ttaaggtgaa ttgctcaaca gtatgaacat ttcgcagcct 780 

accgtagtgt ttgtttccaa aaaggggttg caaaaaattt tgaacgtgca aaaaaaatta 840

ccaataatcc agaaaattat tatcatggat tctaaaacgg attaccaggg atttcagtcg 900 

atgtacacgt tcgtcacatc tcatctacct cccggtttta atgaatacga ttttgtacca 960 

gagtcctttg atcgtgacaa aacaattgca ctgataatga attcctctgg atctactggg 1020 

ttacctaagg gtgtggccct tccgcataga actgcctgcg tcagattctc gcatgccagg 1080 

tatgtcgtat aacaagagat taagtaatgt tgctacacac attgtagaga tcctattttt 1140

ggcaatcaaa tcattccgga tactgcgatt ttaagtgttg ttccattcca tcacggtttt 1200 

ggaatgttta ctacactcgg atatttgata tgtggatttc gagtcgtctt aatgtataga 1260 

tttgaagaag agctgttttt acgatccctt caggattaca aaattcaaag tgcgttgcta 1320 

gtaccaaccc tattttcatt cttcgccaaa agcactctga ttgacaaata cgatttatct 1380 

aatttacacg aaattgcttc tgggggcgca cctctttcga aagaagtcgg ggaagcggtt 1440 

gcaaaacggt gagttaagcg cattgctagt atttcaaggc tctaaaacgg cgcgtagctt 1500 

ccatcttcca gggatacgac aaggatatgg gctcactgag actacatcag ctattctgat 1560 

tacacccgag ggggatgata aaccgggcgc ggtcggtaaa gttgttccat tttttgaagc 1620

gaaggttgtg gatctggata ccgggaaaac gctgggcgtt aatcagagag gcgaattatg 1680 

tgtcagagga cctatgatta tgtccggtta tgtaaacaat ccggaagcga ccaacgcctt 1740

gattgacaag gatggatggc tacattctgg agacatagct tactgggacg aagacgaaca 1800 

cttcttcata gttgaccgct tgaagtcttt aattaaatac aaaggatatc aggtaatgaa 1860 

gatttttaca tgcacacacg ctacaatacc tgtaggtggc ccccgctgaa ttggaatcga 1920 

tattgttaca acaccccaac atcttcgacg cgggcgtggc aggtcttccc gacgatgacg 1980

ccggtgaact tcccgccgcc gttgttgttt tggagcacgg aaagacgatg acggaaaaag 2040 

agatcgtgga ttacgtcgcc agtaaatgaa ttcgttttac gttactcgta ctacaattct 2100 

tttcataggt caagtaacaa ccgcgaaaaa gttgcgcgga ggagttgtgt ttgtggacga 2160 

agtaccgaaa ggtcttaccg gaaaactcga cgcaagaaaa atcagagaga tcctcataaa 2220

ggccaagaag ggcggaaagt ccaaattgta aaatgtaact gtattcagcg atgacgaaat 2280 

tcttagctat tgtaatatta tatgcaaatt gatgaatggt aattttgtaa ttgtgggtca 2340 

ctgtactatt ttaacgaata ataaaatcag gtataggtaa ctaaaaa 2387 

<210> 36 
<211> 238 
<212> PRT 

<213> Aequorea victoria 
<400> 36 

Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
Glu Leu Asp Gly Asp Val Asn Gly Gln Lys Phe Ser Val Ser Gly Glu
20 25 30

Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90

Thr Ile Phe Tyr Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile


Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Met Glu Tyr Asn
130 135 140


145 150
Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Lys Asp Gly Ser Val


Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

Lys Asp Pro Asn Glu Lys Arg Asp His Met Ile Leu Leu Glu Phe Val
210 215 220

Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235 

<210> 37 
<211> 922 
<212> DNA 

<213> Aequorea victoria 

tacacacgaa taaaagataa caaagatgag taaaggagaa gaacttttca ctggagttgt 60
cccaattctt gttgaattag atggcgatgt taatgggcaa aaattctctg tcagtggaga 120 
gggtgaaggt gatgcaacat acggaaaact tacccttaaa tttatttgca ctactgggaa 180
gctacctgtt ccatggccaa cacttgtcac tactttctct tatggtgttc aatgcttttc 240

aagataccca gatcatatga aacagcatga ctttttcaag agtgccatgc ccgaaggtta 300 

tgtacaggaa agaactatat tttacaaaga tgacgggaac tacaagacac gtgctgaagt 360 

caagtttgaa ggtgataccc ttgttaatag aatcgagtta aaaggtattg attttaaaga 420 

agatggaaac attcttggac acaaaatgga atacaactat aactcacata atgtatacat 480 

catggcagac aaaccaaaga atggaatcaa agttaacttc aaaattagac acaacattaa 540 

agatggaagc gttcaattag cagaccatta tcaacaaaat actccaattg gcgatggccc 600 

tgtcctttta ccagacaacc attacctgtc cacacaatct gccctttcca aagatcccaa 660 

cgaaaagaga gatcacatga tccttcttga gtttgtaaca gctgctggga ttacacatgg 720 

catggatgaa ctatacaaat aaatgtccag acttccaatt gacactaaag tgtccgaaca 780 

attactaaat tctcagggtt cctggttaaa ttcaggctga gactttattt atatatttat 840 

agattcatta aaattttatg aataatttat tgatgttatt aataggggct attttcttat 900 

taaataggct actggagtgt at 922 

<210> 38 
<211> 311 
<212> PRT 

<213> Renilla reniformis 
<400> 38 

Met Thr Ser Lys Val Tyr Asp Pro Glu Gln Arg Lys Arg Met Ile Thr
1 5 10 15

Gly Pro Gln Trp Trp Ala Arg Cys Lys Gln Met Asn Val Leu Asp Ser
20 25 30

Phe Ile Asn Tyr Tyr Asp Ser Glu Lys His Ala Glu Asn Ala Val Ile
35 40 45

Phe Leu His Gly Asn Ala Ala Ser Ser Tyr Leu Trp Arg His Val Val
50 55 60

Pro His Ile Glu Pro Val Ala Arg Cys Ile Ile Pro Asp Leu Ile Gly


Met Gly Lys Ser Gly Lys Ser Gly Asn Gly Ser Tyr Arg Leu Leu Asp
85 90 95


Ser Val Val Asp Val Ile Glu Ser Trp Asp Glu Trp Pro Asp Ile Glu
145 150 155 160


Glu Asn Asn Phe Phe Val Glu Thr Met Leu Pro Ser Lys Ile Met Arg
180 185 190

Lys Leu Glu Pro Glu Glu Phe Ala Ala Tyr Leu Glu Pro Phe Lys Glu
195 200 205

Lys Gly Glu Val Arg Arg Pro Thr Leu Ser Trp Pro Arg Glu Ile Pro


215


Leu Val Lys Gly Gly Lys Pro Asp Val Val Gln Ile Val Arg Asn Tyr
225 230 235

Asn Ala Tyr Leu Arg Ala Ser Asp Asp Leu Pro Lys Met Phe Ile Glu
245 250 255

Ser Asp Pro Gly Phe Phe Ser Asn Ala Ile Val Glu Gly Ala Lys Lys
260 265 

Phe Pro Asn Thr Glu Phe Val Lys Val Lys Gly Leu His Phe Ser Gln
275 280 285


Arg Val Leu Lys Asn Glu Gln
305 310 

<210> 39 
<211> 1196 
<212> DNA 

<213> Renilla reniformis

agcttaaaga tgacttcgaa agtttatgat ccagaacaaa ggaaacggat gataactggt 60
ccgcagtggt gggccagatg taaacaaatg aatgttcttg attcatttat taattattat 120
gattcagaaa aacatgcaga aaatgctgtt atttttttac atggtaacgc ggcctcttct 180
tatttatggc gacatgttgt gccacatatt gagccagtag cgcggtgtat tataccagat 240
cttattggta tgggcaaatc aggcaaatct ggtaatggtt cttataggtt acttgatcat 300
tacaaatatc ttactgcatg gtttgaactt cttaatttac caaagaagat catttttgtc 360
ggccatgatt ggggtgcttg tttggcattt cattatagct atgagcatca agataagatc 420
aaagcaatag ttcacgctga aagtgtagta gatgtgattg aatcatggga tgaatggcct 480
gatattgaag aagatattgc gttgatcaaa tctgaagaag gagaaaaaat ggttttggag 540
aataacttct tcgtggaaac catgttgcca tcaaaaatca tgagaaagtt agaaccagaa 600
gaatttgcag catatcttga accattcaaa gagaaaggtg aagttcgtcg tccaacatta 660
tcatggcctc gtgaaatccc gttagtaaaa ggtggtaaac ctgacgttgt acaaattgtt 720
aggaattata atgcttatct acgtgcaagt gatgatttac caaaaatgtt tattgaatcg 780
gatccaggat tcttttccaa tgctattgtt gaaggcgcca agaagtttcc taatactgaa 840
tttgtcaaag taaaaggtct tcatttttcg caagaagatg cacctgatga aatgggaaaa 900
tatatcaaat cgttcgttga gcgagttctc aaaaatgaac aataattact ttggtttttt 960
atttacattt ttcccgggtt taataatata aatgtcattt tcaacaattt tattttaact 1020
gaatatttca cagggaacat tcatatatgt tgattaattt agctcgaact ttactctgtc 1080
atatcatttt ggaatattac ctctttcaat gaaactttat aaacagtggt tcaattaatt 1140
aatatatatt ataattacat ttgttatgta ataaactcgg ttttattata aaaaaa 1196


<210> 40 

<211> 1360 

<212> PRT 

<213> Homo sapiens 

<400> 40 

Met Ser Arg Gln Ser Thr Leu Tyr Ser Phe Phe Pro Lys Ser Pro Ala
1 5 10 15

Leu Ser Asp Ala Asn Lys Ala Ser Ala Arg Ala Ser Arg Glu Gly Gly
20 25 30

Arg Ala Ala Ala Ala Pro Gly Ala Ser Pro Ser Pro Gly Gly Asp Ala
35 40 45

Ala Trp Ser Glu Ala Gly Pro Gly Pro Arg Pro Leu Ala Arg Ser Ala
50 55 60

Ser Pro Pro Lys Ala Lys Asn Leu Asn Gly Gly Leu Arg Arg Ser Val


Ala Pro Ala Ala Pro Thr Ser Cys Asp Phe Ser Pro Gly Asp Leu Val
85 90 95


Gln Arg Ala Asp Glu Ala Leu Asn Lys Asp Lys Ile Lys Arg Leu Glu
Leu Ala Val Cys Asp Glu Pro Ser Glu Pro Glu Glu Glu Glu Glu Met
195 200

Glu Val Gly Thr Thr Tyr Val Thr Asp Lys Ser Glu Glu Asp Asn Glu
210 215 220

Ile Glu Ser Glu Glu Glu Val Gln Pro Lys Thr Gln Gly Ser Arg Arg
225 230 235

Ser Ser Arg Gln Ile Lys Lys Arg Arg Val Ile Ser Asp Ser Glu Ser
245 250 255

Asp Ile Gly Gly Ser Asp Val Glu Phe Lys Pro Asp Thr Lys Glu Glu
260 265

Gly Ser Ser Asp Glu Ile Ser Ser Gly Val Gly Asp Ser Glu Ser Glu
275 280 285

Gly Leu Asn Ser Pro Val Lys Val Ala Arg Lys Arg Lys Arg Met Val
290 295 300

Thr Gly Asn Gly Ser Leu Lys Arg Lys Ser Ser Arg Lys Glu Thr Pro
305

Ser Ala Thr Lys Gln Ala Thr Ser Ile Ser Ser Glu Thr Lys Asn Thr
325 330 335

Leu Arg Ala Phe Ser Ala Pro Gln Asn Ser Glu Ser Gln Ala His Val
340 345 350

Ser Gly Gly Gly Asp Asp Ser Ser Arg Pro Thr Val Trp Tyr His Glu
355 360 365

Thr Leu Glu Trp Leu Lys Glu Glu Lys Arg Arg Asp Glu His Arg Arg
370 375 380

Arg Pro Asp His Pro Asp Phe Asp Ala Ser Thr Leu Tyr Val Pro Glu
385 390

Asp Phe Leu Asn Ser Cys Thr Pro Gly Met Arg Lys Trp Trp Gln Ile
405 410

Lys Ser Gln Asn Phe Asp Leu Val Ile Cys Tyr Lys Val Gly Lys Phe
420 425 430

Tyr Glu Leu Tyr His Met Asp Ala Leu Ile Gly Val Ser Glu Leu Gly
435 440

Leu Val Phe Met Lys Gly Asn Trp Ala His Ser Gly Phe Pro Glu Ile
450
Ser Thr Glu Gly Thr Leu Leu Glu Arg Val Asp Thr Cys His Thr Pro
755 760 765

Phe Gly Lys Arg Leu Leu Lys Gln Trp Leu Cys Ala Pro Leu Cys Asn
770 775 780

His Tyr Ala Ile Asn Asp Arg Leu Asp Ala Ile Glu Asp Leu Met Val
785 790 795 800

Val Pro Asp Lys Ile Ser Glu Val Val Glu Leu Leu Lys Lys Leu Pro


Lys Ser Gln Asn His Pro Asp Ser Arg Ala Ile Met Tyr Glu Glu Thr
835 840 845

Thr Tyr Ser Lys Lys Lys Ile Ile Asp Phe Leu Ser Ala Leu Glu Gly


Phe Lys Val Met Cys Lys Ile Ile Gly Ile Met Glu Glu Val Ala Asp
865 870 875 880

Gly Phe Lys Ser Lys Ile Leu Lys Gln Val Ile Ser Leu Gln Thr Lys


Asn Pro Glu Gly Arg Phe Pro Asp Leu Thr Val Glu Leu Asn Arg Trp
900 905 910

Asp Thr Ala Phe Asp His Glu Lys Ala Arg Lys Thr Gly Leu Ile Thr
915 920 925

Pro Lys Ala Gly Phe Asp Ser Asp Tyr Asp Gln Ala Leu Ala Asp Ile
930 935 940

Arg Glu Asn Glu Gln Ser Leu Leu Glu Tyr Leu Glu Lys Gln Arg Asn
945 950 955 960

Arg Ile Gly Cys Arg Thr Ile Val Tyr Trp Gly Ile Gly Arg Asn Arg
965 970 975

Tyr Gln Leu Glu Ile Pro Glu Asn Phe Thr Thr Arg Asn Leu Pro Glu
980 985 990

Glu Tyr Glu Leu Lys Ser Thr Lys Lys Gly Cys Lys Arg Tyr Trp Thr
995 1000 1005

Lys Thr Ile Glu Lys Lys Leu Ala Asn Leu Ile Asn Ala Glu Glu
1010 1015 1020

Arg Arg Asp Val Ser Leu Lys Asp Cys Met Arg Arg Leu Phe Tyr
1025 1030 1035
Cys Pro Lys Ser Tyr Gly Phe Asn Ala Ala Arg Leu Ala Asn Leu
1295 1300 1305
Pro Glu Glu Val Ile Gln Lys Gly His Arg Lys Ala Arg Glu Phe
1310 1315 1320

Glu Lys Met Asn Gln Ser Leu Arg Leu Phe Arg Glu Val Cys Leu


Ala Ser Glu Arg Ser Thr Val Asp Ala Glu Ala Val His Lys Leu
1340 1345 1350


<210> 41 

<211> 4264 

<212> DNA 

<213> Homo sapiens 


aacggttggg ccttgccggc tgtcggtatg tcgcgacaga gcaccctgta cagcttcttc 120
cccaagtctc cggcgctgag tgatgccaac aaggcctcgg ccagggcctc acgcgaaggc 180
ggccgtgccg ccgctgcccc cggggcctct ccttccccag gcggggatgc ggcctggagc 240
gaggctgggc ctgggcccag gcccttggcg cgatccgcgt caccgcccaa ggcgaagaac 300
ctcaacggag ggctgcggag atcggtagcg cctgctgccc ccaccagttg tgacttctca 360
ccaggagatt tggtttgggc caagatggag ggttacccct ggtggccttg tctggtttac 420
aaccacccct ttgatggaac attcatccgc gagaaaggga aatcagtccg tgttcatgta 480
cagttttttg atgacagccc aacaaggggc tgggttagca aaaggctttt aaagccatat 540
acaggttcaa aatcaaagga agcccagaag ggaggtcatt tttacagtgc aaagcctgaa 600
atactgagag caatgcaacg tgcagatgaa gccttaaata aagacaagat taagaggctt 660
gaattggcag tttgtgatga gccctcagag ccagaagagg aagaagagat ggaggtaggc 720
acaacttacg taacagataa gagtgaagaa gataatgaaa ttgagagtga agaggaagta 780
cagcctaaga cacaaggatc taggcgaagt agccgccaaa taaaaaaacg aagggtcata 840
tcagattctg agagtgacat tggtggctct gatgtggaat ttaagccaga cactaaggag 900
gaaggaagca gtgatgaaat aagcagtgga gtgggggata gtgagagtga aggcctgaac 960
agccctgtca aagttgctcg aaagcggaag agaatggtga ctggaaatgg ctctcttaaa 1020
aggaaaagct ctaggaagga aacgccctca gccaccaaac aagcaactag catttcatca 1080
gaaaccaaga atactttgag agctttctct gcccctcaaa attctgaatc ccaagcccac 1140
gttagtggag gtggtgatga cagtagtcgc cctactgttt ggtatcatga aactttagaa 1200
tggcttaagg aggaaaagag aagagatgag cacaggagga ggcctgatca ccccgatttt 1260
gatgcatcta cactctatgt gcctgaggat ttcctcaatt cttgtactcc tgggatgagg 1320
aagtggtggc agattaagtc tcagaacttt gatcttgtca tctgttacaa ggtggggaaa 1380
ttttatgagc tgtaccacat ggatgctctt attggagtca gtgaactggg gctggtattc 1440
atgaaaggca actgggccca ttctggcttt cctgaaattg catttggccg ttattcagat 1500
tccctggtgc agaagggcta taaagtagca cgagtggaac agactgagac tccagaaatg 1560
atggaggcac gatgtagaaa gatggcacat atatccaagt atgatagagt ggtgaggagg 1620
gagatctgta ggatcattac caagggtaca cagacttaca gtgtgctgga aggtgatccc 1680
tctgagaact acagtaagta tcttcttagc ctcaaagaaa aagaggaaga ttcttctggc 1740
catactcgtg catatggtgt gtgctttgtt gatacttcac tgggaaagtt tttcataggt 1800
cagttttcag atgatcgcca ttgttcgaga tttaggactc tagtggcaca ctatccccca 1860
gtacaagttt tatttgaaaa aggaaatctc tcaaaggaaa ctaaaacaat tctaaagagt 1920
tcattgtcct gttctcttca ggaaggtctg atacccggct cccagttttg ggatgcatcc 1980
aaaactttga gaactctcct tgaggaagaa tattttaggg aaaagctaag tgatggcatt 2040
ggggtgatgt taccccaggt gcttaaaggt atgacttcag agtctgattc cattgggttg 2100
acaccaggag agaaaagtga attggccctc tctgctctag gtggttgtgt cttctacctc 2160
aaaaaatgcc ttattgatca ggagctttta tcaatggcta attttgaaga atatattccc 2220
ttggattctg acacagtcag cactacaaga tctggtgcta tcttcaccaa agcctatcaa 2280
cgaatggtgc tagatgcagt gacattaaac aacttggaga tttttctgaa tggaacaaat 2340
ggttctactg aaggaaccct actagagagg gttgatactt gccatactcc ttttggtaag 2400
cggctcctaa agcaatggct ttgtgcccca ctctgtaacc attatgctat taatgatcgt 2460
ctagatgcca tagaagacct catggttgtg cctgacaaaa tctccgaagt tgtagagctt 2520
ctaaagaagc ttccagatct tgagaggcta ctcagtaaaa ttcataatgt tgggtctccc 2580
ctgaagagtc agaaccaccc agacagcagg gctataatgt atgaagaaac tacatacagc 2640
aagaagaaga ttattgattt tctttctgct ctggaaggat tcaaagtaat gtgtaaaatt 2700
atagggatca tggaagaagt tgctgatggt tttaagtcta aaatccttaa gcaggtcatc 2760
tctctgcaga caaaaaatcc tgaaggtcgt tttcctgatt tgactgtaga attgaaccga 2820
tgggatacag cctttgacca tgaaaaggct cgaaagactg gacttattac tcccaaagca 2880
ggctttgact ctgattatga ccaagctctt gctgacataa gagaaaatga acagagcctc 2940
ctggaatacc tagagaaaca gcgcaacaga attggctgta ggaccatagt ctattggggg 3000
attggtagga accgttacca gctggaaatt cctgagaatt tcaccactcg caatttgcca 3060
gaagaatacg agttgaaatc taccaagaag ggctgtaaac gatactggac caaaactatt 3120
gaaaagaagt tggctaatct cataaatgct gaagaacgga gggatgtatc attgaaggac 3180
tgcatgcggc gactgttcta taactttgat aaaaattaca aggactggca gtctgctgta 3240
gagtgtatcg cagtgttgga tgttttactg tgcctggcta actatagtcg agggggtgat 3300
ggtcctatgt gtcgcccagt aattctgttg ccggaagata cccccccctt cttagagctt 3360
aaaggatcac gccatccttg cattacgaag actttttttg gagatgattt tattcctaat 3420
gacattctaa taggctgtga ggaagaggag caggaaaatg gcaaagccta ttgtgtgctt 3480
gttactggac caaatatggg gggcaagtct acgcttatga gacaggctgg cttattagct 3540
gtaatggccc agatgggttg ttacgtccct gctgaagtgt gcaggctcac accaattgat 3600
agagtgttta ctagacttgg tgcctcagac agaataatgt caggtgaaag tacatttttt 3660
gttgaattaa gtgaaactgc cagcatactc atgcatgcaa cagcacattc tctggtgctt 3720
gtggatgaat taggaagagg tactgcaaca tttgatggga cggcaatagc aaatgcagtt 3780
gttaaagaac ttgctgagac tataaaatgt cgtacattat tttcaactca ctaccattca 3840
ttagtagaag attattctca aaatgttgct gtgcgcctag gacatatggc atgcatggta 3900
gaaaatgaat gtgaagaccc cagccaggag actattacgt tcctctataa attcattaag 3960
ggagcttgtc ctaaaagcta tggctttaat gcagcaaggc ttgctaatct cccagaggaa 4020
gttattcaaa agggacatag aaaagcaaga gaatttgaga agatgaatca gtcactacga 4080
ttatttcggg aagtttgcct ggctagtgaa aggtcaactg tagatgctga agctgtccat 4140
aaattgctga ctttgattaa ggaattatag actgactaca ttggaagctt tgagttgact 4200
tctgaccaaa ggtggtaaat tcagacaaca ttatgatcta ataaacttta ttttttaaaa 4260
atga 4264


<210> 42

<211> 389

<212> PRT

<213> Homo sapiens


<400> 42 

Met Ala Gln Pro Lys Gln Glu Arg Val Ala Arg Ala Arg His Gln Arg
1 5

Ser Glu Thr Ala Arg His Gln Arg Ser Glu Thr Ala Lys Thr Pro Thr
20 25

Leu Gly Asn Arg Gln Thr Pro Thr Leu Gly Asn Arg Gln Thr Pro Arg

Leu Gly Ile His Ala Arg Pro Arg Arg Arg Ala Thr Thr Ser Leu Leu
50

Thr Leu Leu Leu Ala Phe Gly Lys Asn Ala Val Arg Cys Ala Leu Ile

Gly Pro Gly Ser Leu Thr Ser Arg Thr Arg Pro Leu Thr Glu Pro Leu

Gly Glu Lys Glu Arg Arg Glu Val Phe Phe Pro Pro Arg Pro Glu Arg
100

Val Glu His Asn Val Glu Ser Ser Arg Trp Glu Pro Arg Arg Arg Gly
115 120


Ala Cys Gly Ser Arg Gly Gly Asn Phe Pro Ser Pro Arg Gly Gly Ser
130 135
Gly Val Ala Ser Leu Glu Arg Ala Glu Asn Ser Ser Thr Glu Pro Ala
145 150 155 160


Glu Glu Glu Asn Phe Glu Gly Phe Thr Leu Lys His His Thr Cys Lys
225 230 235 240


Phe Cys Arg Asp Cys Gln Phe Pro Glu Ala Ser Pro Ala Met Leu Pro
340 345 350 


<210> 43 

<211> 1408 

<212> DNA 

<213> Homo sapiens 

<400> 43 

ggcgctccta cctgcaagtg gctagtgcca agtgctgggc cgccgctcct gccgtgcatg 60
ttggggagcc agtacatgca ggtgggctcc acacggagag gggcgcagac ccggtgacag 120
ggctttacct ggtacatcgg catggcgcaa ccaaagcaag agagggtggc gcgtgccaga 180
caccaacggt cggaaaccgc cagacaccaa cggtcggaaa ccgccaagac accaacgctc 240
ggaaaccgcc agacaccaac gctcggaaac cgccagacac caaggctcgg aatccacgcc 300
aggccacgac ggagggcgac tacctccctt ctgaccctgc tgctggcgtt cggaaaaaac 360
gcagtccggt gtgctctgat tggtccaggc tctttgacgt cacggactcg acctttgaca 420
gagccactag gcgaaaagga gagacgggaa gtattttttc cgccccgccc ggaaagggtg 480
gagcacaacg tcgaaagcag ccgttgggag cccaggaggc ggggcgcctg tgggagccgt 540
ggagggaact ttcccagtcc ccgaggcgga tccggtgttg catccttgga gcgagctgag 600
aactcgagta cagaacctgc taaggccatc aaacctattg atcggaagtc agtccatcag 660
atttgctctg ggccggtggt accgagtcta aggccgaatg cggtgaagga gttagtagaa 720
aacagtctgg atgctggtgc cactaatgtt gatctaaagc ttaaggacta tggagtggat 780
ctcattgaag tttcaggcaa tggatgtggg gtagaagaag aaaacttcga aggctttact 840
ctgaaacatc acacatgtaa gattcaagag tttgccgacc taactcaggt ggaaactttt 900
ggctttcggg gggaagctct gagctcactt tgtgcactga gtgatgtcac catttctacc 960
tgccgtgtat cagcgaaggt tgggactcga ctggtgtttg atcactatgg gaaaatcatc 1020
cagaaaaccc cctacccccg ccccagaggg atgacagtca gcgtgaagca gttattttct 1080
acgctacctg tgcaccataa agaatttcaa aggaatatta agaagaaacg tgcctgcttc 1140
cccttcgcct tctgccgtga ttgtcagttt cctgaggcct ccccagccat gcttcctgta 1200
cagcctgtag aactgactcc tagaagtacc ccaccccacc cctgctcctt ggaggacaac 1260
gtgatcactg tattcagctc tgtcaagaat ggtccaggtt cttctagatg atctgcacaa 1320
atggttcctc tcctccttcc tgatgtctgc cattagcatt ggaataaagt tcctgctgaa 1380
aatccaaaaa aaaaaaaaaa aaaaaaaa 1408

<210> 44

<211> 264

<212> PRT

<213> Homo sapiens

<400> 44

Met Cys Pro Trp Arg Pro Arg Leu Gly Arg Arg Cys Met Val Ser Pro
1 5 10 15

Arg Glu Ala Asp Leu Gly Pro Gln Lys Asp Thr Arg Leu Asp Leu Pro
20 25 30

Arg Ser Pro Ala Arg Ala Pro Arg Glu Gln Asn Ser Leu Gly Glu Val


Asp Arg Arg Gly Pro Arg Glu Gln Thr Arg Ala Pro Ala Thr Ala Ala
50 55 60

Pro Pro Arg Pro Leu Gly Ser Arg Gly Ala Glu Ala Ala Glu Pro Gln


Glu Gly Leu Ser Ala Thr Val Ser Ala Cys Phe Gln Glu Gln Gln Glu


Phe Ser Ser Glu Thr Ser His Met 
260 

<210> 45 

<211> 1785 

<212> DNA 

<213> Homo sapiens 

60 

gcaagaacag cttaagacca gtcagtggtt gctcctaccc attcagtggc ctgagcagtg 120
gggagctgca gaccagtctt ccgtggcagg ctgagcgctc cagtcttcag tagggaattg 180
ctgaataggc acagagggca cctgtacacc ttcagaccag tctgcaacct caggctgagt 240
agcagtgaac tcaggagcgg gagcagtcca ttcaccctga aattcctcct tggtcactgc 300
cttctcagca gcagcctgct cttctttttc aatctcttca ggatctctgt agaagtacag 360
atcaggcatg acctcccatg ggtgttcacg ggaaatggtg ccacgcatgc gcagaacttc 420
ccgagccagc atccaccaca ttaaacccac tgagtgagct cccttgttgt tgcatgggat 480
ggcaatgtcc acatagcgca gaggagaatc tgtgttacac agcgcaatgg taggtaggtt 540
aacataagat gcctccgtga gaggcgaagg ggcggcggga cccgggcctg gcccgtatgt 600
gtccttggcg gcctagacta ggccgtcgct gtatggtgag ccccagggag gcggatctgg 660
gcccccagaa ggacacccgc ctggatttgc cccgtagccc ggcccgggcc cctcgggagc 720
agaacagcct tggtgaggtg gacaggaggg gacctcgcga gcagacgcgc gcgccagcga 780
cagcagcccc gccccggcct ctcgggagcc ggggggcaga ggctgcggag ccccaggagg 840
gtctatcagc cacagtctct gcatgtttcc aagagcaaca ggaaatgaac acattgcagg 900
ggccagtgtc attcaaagat gtggctgtgg atttcaccca ggaggagtgg cggcaactgg 960
accctgatga gaagatagca tacggggatg tgatgttgga gaactacagc catctagttt 1020
ctgtggggta tgattatcac caagccaaac atcatcatgg agtggaggtg aaggaagtgg 1080
agcagggaga ggagccgtgg ataatggaag gtgaatttcc atgtcaacat agtccagaac 1140
ctgctaaggc catcaaacct attgatcgga agtcagtcca tcagatttgc tctgggccag 1200
tggtactgag tctaagcact gcagtgaagg agttagtaga aaacagtctg gatgctggtg 1260
ccactaatat tgatctaaag cttaaggact atggagtgga tctcattgaa gtttcagaca 1320
atggatgtgg ggtagaagaa gaaaactttg aaggcttaat ctctttcagc tctgaaacat 1380
cacacatgta agattcaaga gtttgccgac ctaactgaag ttgaaacttt cggttttcag 1440
ggggaagctc tgagctcact gtgtgcactg agcgatgtca ccatttctac ctgccacgcg 1500
ttggtgaagg ttgggactcg actggtgttt gatcacgatg ggaaaatcat ccaggaaacc 1560
ccctaccccc accccagagg gaccacagtc agcgtgaagc agttattttc tacgctacct 1620
gtgcgccata aggaatttca aaggaatatt aagaagacgt gcctgcttcc ccttcgcctt 1680
ctgccgtgat tgtcagtttc ctgaggcctc cccagccatg cttcctgtac agcctgcaga 1740
actgtgagtc aattaaacct cttttcttca taaattaaaa aaaaa 1785


<210> 46

<211> 583

<212> PRT

<213> Artificial Sequence


<400> 46 

Met Lys Lys Pro Glu Leu Thr Ala Thr Ser Val Glu Lys Phe Leu Ile


Glu Lys Phe Asp Ser Val Ser Asp Leu Met Gln Leu Ser Glu Gly Glu


Glu Ser Arg Ala Phe Ser Phe Asp Val Gly Gly Arg Gly Tyr Val Leu
35 40

Arg Val Asn Ser Cys Ala Asp Gly Phe Tyr Lys Asp Arg Tyr Val Tyr
Arg His Phe Ala Ser Ala Ala Leu Pro Ile Pro Glu Val Leu Asp Ile
65 70 75 80

Gly Glu Phe Ser Glu Ser Leu Thr Tyr Cys Ile Ser Arg Arg Ala Gln


Gln Pro Val Ala Glu Ala Met Asp Ala Ile Ala Ala Ala Asp Leu Ser
115 120 125

Gln Thr Ser Gly Phe Gly Pro Phe Gly Pro Gln Gly Ile Gly Gln Tyr


Thr Thr Trp Arg Asp Phe Ile Cys Ala Ile Ala Asp Pro His Val Tyr
145 150 155 160


His Leu Val His Ala Asp Phe Gly Ser Asn Asn Val Leu Thr Asp Asn
195 200 205


Ser Gln Tyr Glu Val Ala Asn Ile Phe Phe Trp Arg Pro Trp Leu Ala

225 230 235 240

Cys Met Glu Gln Gln Thr Arg Tyr Phe Glu Arg Arg His Pro Glu Leu

245 250 255

Ala Gly Ser Pro Arg Leu Arg Ala Tyr Met Leu Arg Ile Gly Leu Asp

260 265 270

Gln Leu Tyr Gln Ser Leu Val Asp Gly Asn Phe Asp Asp Ala Ala Trp

275 280 285

Ala Gln Gly Arg Cys Asp Ala Ile Val Arg Ser Gly Ala Gly Thr Val

290 295 300


Cys Val Glu Val Leu Ala Asp Ser Gly Asn Arg Arg Pro Ser Thr Arg
Pro Asp Arg Glu Met Gly Glu Ala Asn Met Ser Lys Gly Glu Glu Leu
340

Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp Val His
355 360 365

Gly His Lys Phe Ser Val Arg Gly Glu Gly Glu Gly Asp Ala Asp Tyr
370 375 380

Gly Lys Leu Glu Ile Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val
385 390 395

Pro Trp Pro Thr Leu Val Thr Thr Leu Gly Tyr Gly Ile Leu Cys Phe
405 410

Ala Arg Tyr Pro Glu His Met Lys Met Asn Asp Phe Phe Lys Ser Ala
420 425

Met Pro Glu Gly Tyr Ile Gln Glu Arg Thr Ile Phe Phe Gln Asp Asp
435 440

Gly Lys Tyr Lys Thr Arg Gly Glu Val Lys Phe Glu Gly Asp Thr Leu

Val Asn Arg Ile Glu Leu Lys Gly Met Asp Phe Lys Glu Asp Gly Asn
465 470

Ile Leu Gly His Lys Leu Glu Tyr Asn Phe Asn Ser His Asn Val Tyr
485 490 495

Ile Met Pro Asp Lys Ala Asn Asn Gly Leu Lys Val Asn Phe Lys Ile
500

Arg His Asn Ile Glu Gly Gly Gly Val Gln Leu Ala Asp His Tyr Gln
515 520 525

Thr Asn Val Pro Leu Gly Asp Gly Pro Val Leu Ile Pro Ile Asn His
530 535 540

Tyr Leu Ser Thr Gln Thr Ala Ile Ser Lys Asp Arg Asn Glu Thr Arg
545 550 555

Asp His Met Val Phe Leu Glu Phe Phe Ser Ala Cys Gly His Thr His
565 570 

Gly Met Asp Glu Leu Tyr Lys 


<210> 47 

<211> 895 

<212> PRT 

<213> Artificial Sequence 


<223> Chimera: Luc :
<400> 47

Met Lys Lys Pro Glu Leu Thr Ala Thr Ser Val Glu Lys Phe Leu Ile


Glu Lys Phe Asp Ser Val Ser Asp Leu Met Gln Leu Ser Glu Gly Glu


Glu Ser Arg Ala Phe Ser Phe Asp Val Gly Gly Arg Gly Tyr Val Leu


Arg Val Asn Ser Cys Ala Asp Gly Phe Tyr Lys Asp Arg Tyr Val Tyr


Arg His Phe Ala Ser Ala Ala Leu Pro Ile Pro Glu Val Leu Asp Ile


Gly Glu Phe Ser Glu Ser Leu Thr Tyr Cys Ile Ser Arg Arg Ala Gln


Gly Arg Ile Thr Ala Val Ile Asp Trp Ser Glu Ala Met Phe Gly Asp
210 215 220
Gln Leu Tyr Gln Ser Leu Val Asp Gly Asn Phe Asp Asp Ala Ala Trp


Ala Gln Gly Arg Cys Asp Ala Ile Val Arg Ser Gly Ala Gly Thr Val
290 295 300


Cys Val Glu Val Leu Ala Asp Ser Gly Asn Arg Arg Pro Ser Thr Arg
325 330 335

Pro Asp Arg Glu Met Gly Glu Ala Asn Met Glu Asp Ala Lys Asn Ile
340 345 350

Lys Lys Gly Pro Ala Pro Phe Tyr Pro Leu Glu Asp Gly Thr Ala Gly
355 360 365

Glu Gln Leu His Lys Ala Met Lys Arg Tyr Ala Leu Val Pro Gly Thr
370 375 380

Ile Ala Phe Thr Asp Ala His Ile Glu Val Asn Ile Thr Tyr Ala Glu
385 390 395 400

Tyr Phe Glu Met Ser Val Arg Leu Ala Glu Ala Met Lys Arg Tyr Gly


Leu Asn Thr Asn His Arg Ile Val Val Cys Ser Glu Asn Ser Leu Gln
420 425 430

Phe Phe Met Pro Val Leu Gly Ala Leu Phe Ile Gly Val Ala Val Ala
435 440 445

Pro Ala Asn Asp Ile Tyr Asn Glu Arg Glu Leu Leu Asn Ser Met Asn
450 455 460

Ile Ser Gln Pro Thr Val Val Phe Val Ser Lys Lys Gly Leu Gln Lys


Ile Leu Asn Val Gln Lys Lys Leu Pro Ile Ile Gln Lys Ile Ile Ile
485 490 495

Met Asp Ser Lys Thr Asp Tyr Gln Gly Phe Gln Ser Met Tyr Thr Phe
500 505 510

Val Thr Ser His Leu Pro Pro Gly Phe Asn Glu Tyr Asp Phe Val Pro


Glu Ser Phe Asp Arg Asp Lys Thr Ile Ala Leu Ile Met Asn Ser Ser
530 535 540

Gly Ser Thr Gly Leu Pro Lys Gly Val Ala Leu Pro His Arg Thr Ala
545 550 555 560
Leu Met Tyr Arg Phe Glu Glu Glu Leu Phe Leu Arg Ser Leu Gln Asp
610 615 620

Tyr Lys Ile Gln Ser Ala Leu Leu Val Pro Thr Leu Phe Ser Phe Phe
625 630 635 640

Ala Lys Ser Thr Leu Ile Asp Lys Tyr Asp Leu Ser Asn Leu His Glu
645 650 655


Gly Ala Val Gly Lys Val Val Pro Phe Phe Glu Ala Lys Val Val Asp
705 710 715 720


Ser Ile Leu Leu Gln His Pro Asn Ile Phe Asp Ala Gly Val Ala Gly
805 810 815
Ser Gln Val Thr Thr Ala Lys Lys Leu Arg Gly Gly Val Val Phe Val
850 855 860

Asp Glu Val Pro Lys Gly Leu Thr Gly Lys Leu Asp Ala Arg Lys Ile
865 870 875 880

Arg Glu Ile Leu Ile Lys Ala Lys Lys Gly Gly Lys Ser Lys Leu
885 890 895